[jira] [Updated] (HDFS-4273) Fix some issue in DFSInputstream
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-4273: --- Resolution: Won't Fix Target Version/s: 2.0.3-alpha, 3.0.0 (was: 3.0.0, 2.0.3-alpha) Status: Resolved (was: Patch Available) I looked into tests added by .v8 patch. {{TestDFSClientRetries#testDFSInputStreamReadRetryTime}} added by .v8 patch. The test expects client to always retry up to maxBlockAcquireFailures but it is not true. Client does not retry to same node on ChecksumException. {{seekToNewSource}} returning 0 means there is no more possible datanodes and it is right to give up even if retry count (failures) does not reache to max. {{testSeekToNewSourcePastFileSize}} and {{testNegativeSeekToNewSource}} added to {{TestSeekBug}} calls {{FSDataInpuStream#seekToNewSource}} just after opening file. This causes NullPointerException because currentNode is not set in DFSInputstream. Tests passed after fixing this. I close this issue as Won't fix. Fix some issue in DFSInputstream Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, HDFS-4273.v7.patch, HDFS-4273.v8.patch, TestDFSInputStream.java Following issues in DFSInputStream are addressed in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it is cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. Change failures to local variable solve this issue. 3. If local datanode is added to deadNodes, it will not be removed from deadNodes if DN is back alive. We need a way to remove local datanode from deadNodes when the local datanode is become live. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4273) Fix some issue in DFSInputstream
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-4273: --- Labels: (was: BB2015-05-TBR) Fix some issue in DFSInputstream Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, HDFS-4273.v7.patch, HDFS-4273.v8.patch, TestDFSInputStream.java Following issues in DFSInputStream are addressed in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it is cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. Change failures to local variable solve this issue. 3. If local datanode is added to deadNodes, it will not be removed from deadNodes if DN is back alive. We need a way to remove local datanode from deadNodes when the local datanode is become live. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4273) Fix some issue in DFSInputstream
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-4273: --- Labels: BB2015-05-TBR (was: ) Fix some issue in DFSInputstream Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Labels: BB2015-05-TBR Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, HDFS-4273.v7.patch, HDFS-4273.v8.patch, TestDFSInputStream.java Following issues in DFSInputStream are addressed in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it is cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. Change failures to local variable solve this issue. 3. If local datanode is added to deadNodes, it will not be removed from deadNodes if DN is back alive. We need a way to remove local datanode from deadNodes when the local datanode is become live. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4273) Fix some issue in DFSInputstream
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated HDFS-4273: Attachment: HDFS-4273.v8.patch Update patch to remove expiring local deadNodes related changes. Will create another jira addressing it. Fix some issue in DFSInputstream Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, HDFS-4273.v7.patch, HDFS-4273.v8.patch, TestDFSInputStream.java Following issues in DFSInputStream are addressed in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it is cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. Change failures to local variable solve this issue. 3. If local datanode is added to deadNodes, it will not be removed from deadNodes if DN is back alive. We need a way to remove local datanode from deadNodes when the local datanode is become live. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4273) Fix some issue in DFSInputstream
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated HDFS-4273: Summary: Fix some issue in DFSInputstream (was: Problem in DFSInputStream read retry logic may cause early failure) Fix some issue in DFSInputstream Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, HDFS-4273.v7.patch, TestDFSInputStream.java Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} some issues of the logic: 1. seekToNewSource() logic is broken because it may clear deadNodes in the middle. 2. the variable int retries=2 in readWithStrategy seems have conflict with MaxBlockAcquireFailures, should it be removed? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4273) Fix some issue in DFSInputstream
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated HDFS-4273: Description: Follow issues in DFSInputStream is address in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. 3. was: Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} some issues of the logic: 1. seekToNewSource() logic is broken because it may clear deadNodes in the middle. 2. the variable int retries=2 in readWithStrategy seems have conflict with MaxBlockAcquireFailures, should it be removed? Fix some issue in DFSInputstream Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, HDFS-4273.v7.patch, TestDFSInputStream.java Follow issues in DFSInputStream is address in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. 3. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4273) Fix some issue in DFSInputstream
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated HDFS-4273: Description: Follow issues in DFSInputStream is address in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. 3. If local datanode is added to deadNodes, it will not be removed from deadNodes if DN is back alive. We need a way to remove local datanode from deadNodes when the local datanode is become live. was: Follow issues in DFSInputStream is address in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. 3. Fix some issue in DFSInputstream Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, HDFS-4273.v7.patch, TestDFSInputStream.java Follow issues in DFSInputStream is address in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. 3. If local datanode is added to deadNodes, it will not be removed from deadNodes if DN is back alive. We need a way to remove local datanode from deadNodes when the local datanode is become live. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4273) Fix some issue in DFSInputstream
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated HDFS-4273: Description: Following issues in DFSInputStream are addressed in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it is cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. Change failures to local variable solve this issue. 3. If local datanode is added to deadNodes, it will not be removed from deadNodes if DN is back alive. We need a way to remove local datanode from deadNodes when the local datanode is become live. was: Follow issues in DFSInputStream is address in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. 3. If local datanode is added to deadNodes, it will not be removed from deadNodes if DN is back alive. We need a way to remove local datanode from deadNodes when the local datanode is become live. Fix some issue in DFSInputstream Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, HDFS-4273.v7.patch, TestDFSInputStream.java Following issues in DFSInputStream are addressed in this jira: 1. read may not retry enough in some cases cause early failure Assume the following call logic {noformat} readWithStrategy() - blockSeekTo() - readBuffer() - reader.doRead() - seekToNewSource() add currentNode to deadnode, wish to get a different datanode - blockSeekTo() - chooseDataNode() - block missing, clear deadNodes and pick the currentNode again seekToNewSource() return false readBuffer() re-throw the exception quit loop readWithStrategy() got the exception, and may fail the read call before tried MaxBlockAcquireFailures. {noformat} 2. In multi-threaded scenario(like hbase), DFSInputStream.failures has race condition, it is cleared to 0 when it is still used by other thread. So it is possible that some read thread may never quit. Change failures to local variable solve this issue. 3. If local datanode is added to deadNodes, it will not be removed from deadNodes if DN is back alive. We need a way to remove local datanode from deadNodes when the local datanode is become live. -- This message was sent by Atlassian JIRA (v6.1.5#6160)