[jira] [Commented] (HDFS-15924) Log4j will cause Server handler blocked when audit log boom.
[ https://issues.apache.org/jira/browse/HDFS-15924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309423#comment-17309423 ]

HuangTao commented on HDFS-15924:
---------------------------------

Which Hadoop version is used in your cluster?

> Log4j will cause Server handler blocked when audit log boom.
> -------------------------------------------------------------
>
>                 Key: HDFS-15924
>                 URL: https://issues.apache.org/jira/browse/HDFS-15924
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Qi Zhu
>            Priority: Major
>         Attachments: image-2021-03-26-16-18-03-341.png, image-2021-03-26-16-19-42-165.png
>
> !image-2021-03-26-16-18-03-341.png|width=707,height=234!
> !image-2021-03-26-16-19-42-165.png|width=824,height=198!
> The blocked Server handler threads during an audit-log burst are shown above.
> As in [https://dzone.com/articles/log4j-thread-deadlock-case], it seems to be the same case under heavy load. Should we upgrade to Log4j2, or is there anything else we can do to improve heavy audit logging?
>
> {code:java}
> /**
>    Call the appenders in the hierrachy starting at
>    this. If no appenders could be found, emit a
>    warning.
>
>    This method calls all the appenders inherited from the
>    hierarchy circumventing any evaluation of whether to log or not
>    to log the particular log request.
>
>    @param event the event to log.  */
> public void callAppenders(LoggingEvent event) {
>   int writes = 0;
>
>   for(Category c = this; c != null; c=c.parent) {
>     // Protected against simultaneous call to addAppender, removeAppender,...
>     synchronized(c) {
>       if(c.aai != null) {
>         writes += c.aai.appendLoopOnAppenders(event);
>       }
>       if(!c.additive) {
>         break;
>       }
>     }
>   }
>
>   if(writes == 0) {
>     repository.emitNoAppenderWarning(this);
>   }
> }{code}
> This Log4j code synchronizes on every Category in the hierarchy, which is what causes the handlers to block.
> cc [~weichiu] [~hexiaoqiao] [~ayushtkn] [~shv] [~ferhui]
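A commonly suggested mitigation for the synchronized callAppenders() path is to put a log4j 1.x AsyncAppender in front of the audit appenders, so handler threads only enqueue events and the slow, synchronized appender writes happen on a background thread. The sketch below is illustrative only: it is not the change proposed on this issue, the audit logger name, buffer size, and non-blocking policy are assumptions, and non-blocking mode will drop (and summarize) events under sustained overload.

{code:java}
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

import org.apache.log4j.Appender;
import org.apache.log4j.AsyncAppender;
import org.apache.log4j.Logger;

public final class AsyncAuditLogSetup {

  /** Wrap the audit logger's existing appenders in an AsyncAppender. */
  public static void makeAuditLogAsync() {
    // Assumed logger name; adjust to the audit logger actually in use.
    Logger audit = Logger.getLogger(
        "org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit");

    AsyncAppender async = new AsyncAppender();
    async.setName("async-audit");
    async.setBufferSize(8192);   // events buffered before the policy below applies
    async.setBlocking(false);    // drop events instead of blocking RPC handlers

    @SuppressWarnings("unchecked")
    Enumeration<Appender> current = audit.getAllAppenders();
    List<Appender> existing = Collections.list(current);
    for (Appender a : existing) {
      async.addAppender(a);      // synchronized writes now happen off the handler thread
    }
    audit.removeAllAppenders();
    audit.addAppender(async);
    audit.setAdditivity(false);  // keep audit events out of the parent hierarchy walk
  }

  private AsyncAuditLogSetup() {
  }
}
{code}

Note that log4j 1.x can only wire an AsyncAppender declaratively through log4j.xml rather than log4j.properties, which is part of why moving to Log4j2 and its async loggers is raised in the description.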
[jira] [Commented] (HDFS-15739) Missing Javadoc for a param in DFSNetworkTopology
[ https://issues.apache.org/jira/browse/HDFS-15739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17252584#comment-17252584 ]

HuangTao commented on HDFS-15739:
---------------------------------

+1, LGTM on the PR.

> Missing Javadoc for a param in DFSNetworkTopology
> --------------------------------------------------
>
>                 Key: HDFS-15739
>                 URL: https://issues.apache.org/jira/browse/HDFS-15739
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>            Reporter: zhanghuazong
>            Assignee: zhanghuazong
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HDFS-15739.0.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Add the missing Javadoc for a param of the chooseRandomWithStorageType method in DFSNetworkTopology.java.
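For reference, the sketch below shows what a fully documented signature looks like; the parameter names and wording are placeholders, not the exact method signature or Javadoc text of the HDFS-15739 patch.

{code:java}
import java.util.Collection;

import org.apache.hadoop.fs.StorageType;
import org.apache.hadoop.net.Node;

/** Illustration only: every parameter carries its own @param tag. */
interface TopologyJavadocExample {

  /**
   * Randomly choose one node from the given scope that can serve the
   * requested storage type.
   *
   * @param scope         the scope (e.g. a rack) to choose the node from
   * @param excludedScope a scope that must not be chosen from, may be null
   * @param excludedNodes nodes that must not be returned, may be null
   * @param type          the storage type the chosen node must provide
   * @return a node satisfying the requirement, or null if none is available
   */
  Node chooseRandomWithStorageType(String scope, String excludedScope,
      Collection<Node> excludedNodes, StorageType type);
}
{code}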
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244379#comment-17244379 ]

HuangTao commented on HDFS-15240:
---------------------------------

I have checked and run the previously failed unit test TestReconstructStripedFile.testNNSendsErasureCodingTasks on branch-3.1; it passes locally.

> Erasure Coding: dirty buffer causes reconstruction block error
> ---------------------------------------------------------------
>
>                 Key: HDFS-15240
>                 URL: https://issues.apache.org/jira/browse/HDFS-15240
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, erasure-coding
>            Reporter: HuangTao
>            Assignee: HuangTao
>            Priority: Blocker
>             Fix For: 3.4.0, 3.2.3
>
>         Attachments: HDFS-15240-branch-3.1-001.patch, HDFS-15240-branch-3.2.001.patch, HDFS-15240-branch-3.3-001.patch, HDFS-15240.001.patch, HDFS-15240.002.patch, HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, HDFS-15240.009.patch, HDFS-15240.010.patch, HDFS-15240.011.patch, HDFS-15240.012.patch, HDFS-15240.013.patch, image-2020-07-16-15-56-38-608.png, org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, test-HDFS-15240.006.patch
>
> When reading some lzo files we found that some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) directly from the DataNodes, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), and computed the longest common substring (LCS) between b6' (decoded) and b6 (read from the DN), and likewise for b7'/b7 and b8'/b8.
> After iterating through every combination of 6 blocks from the block group, I found one case where the length of the LCS is the block length minus 64KB, and 64KB is exactly the length of the ByteBuffer used by StripedBlockReader. So the corrupt reconstruction block was produced from a dirty buffer.
> The following log snippet (showing only 2 of the 28 cases) is my check program's output. In my case, I knew the 3rd block was corrupt, so 5 other blocks were needed to decode another 3 blocks, and I then found that the 1st block's LCS is the block length minus 64KB.
> It means blocks (0,1,2,4,5,6) were used to reconstruct the 3rd block, and the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader read from offset 0 of the 1st block after the dirty buffer had been used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest common substring length is 4
>
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but where does the dirty buffer come from?
> After digging into the code and the DN log, I found that the following DN log is the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/:52586 remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
>         at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
>         at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
>         at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
>         at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
>         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:834) {code}
> Reading from a DN may time out (held by a future F) and emit the INFO log above, but the futures collection that contains the future (F) is cleared
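The block comparison described above can be reproduced with a small standalone program. The sketch below assumes the internal blocks have already been dumped to local files (the file names are hypothetical) and uses a quadratic dynamic program, so it is only practical on short prefixes such as the first 131072 bytes; it is not the reporter's actual check tool.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

/**
 * Minimal sketch of the check described above: compare a decoded internal
 * block with the copy read back from the DataNode by computing the length of
 * their longest common substring (LCS).  An LCS of (block length - 64KB)
 * points at a single dirty 64KB buffer.
 */
public class BlockLcsCheck {

  /** LCS length of a[0..limit) and b[0..limit), O(limit^2) time. */
  static int longestCommonSubstring(byte[] a, byte[] b, int limit) {
    int n = Math.min(a.length, limit);
    int m = Math.min(b.length, limit);
    int best = 0;
    int[] prev = new int[m + 1];
    int[] cur = new int[m + 1];
    for (int i = 1; i <= n; i++) {
      for (int j = 1; j <= m; j++) {
        cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1 : 0;
        best = Math.max(best, cur[j]);
      }
      int[] tmp = prev;                 // reuse two rows instead of a full table
      prev = cur;
      cur = tmp;
      Arrays.fill(cur, 0);
    }
    return best;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical dump files: block[1] read from the DN vs. block[1']
    // decoded from the other internal blocks of the group.
    byte[] fromDn = Files.readAllBytes(Paths.get("block1.dn"));
    byte[] decoded = Files.readAllBytes(Paths.get("block1.decoded"));

    // 131072 matches the "first 131072 bytes" checks in the output above; a
    // full-block comparison needs a faster LCS algorithm than this sketch.
    int lcs = longestCommonSubstring(fromDn, decoded, 131072);
    System.out.println("first 131072 bytes: LCS length = " + lcs);
  }
}
{code}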
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244245#comment-17244245 ]

HuangTao commented on HDFS-15240:
---------------------------------

Uploaded [^HDFS-15240-branch-3.1-001.patch] for branch-3.1 and [^HDFS-15240-branch-3.3-001.patch] for branch-3.3.
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

HuangTao updated HDFS-15240:
----------------------------
    Attachment: HDFS-15240-branch-3.1-001.patch
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

HuangTao updated HDFS-15240:
----------------------------
    Attachment: HDFS-15240-branch-3.3-001.patch
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242388#comment-17242388 ]

HuangTao commented on HDFS-15240:
---------------------------------

Thanks [~tasanuma] for your careful review. I addressed all your comments in [^HDFS-15240.013.patch].
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

HuangTao updated HDFS-15240:
----------------------------
    Attachment: HDFS-15240.013.patch
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240439#comment-17240439 ]

HuangTao commented on HDFS-15240:
---------------------------------

Uploaded [^HDFS-15240.012.patch] to fix the flaky test cases. The test cases can't pass with the XOR-2-1 erasure coding policy, because these two cases need to tolerate at least two broken blocks.
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

HuangTao updated HDFS-15240:
----------------------------
    Attachment: HDFS-15240.012.patch
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

HuangTao updated HDFS-15240:
----------------------------
    Attachment: HDFS-15240.011.patch
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239546#comment-17239546 ]

HuangTao commented on HDFS-15240:
---------------------------------

Uploaded [^HDFS-15240.010.patch], which adds a new function clearFuturesAndService to remove all stale futures from readService and clear futures.
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

HuangTao updated HDFS-15240:
----------------------------
    Attachment: HDFS-15240.010.patch
[jira] [Comment Edited] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239090#comment-17239090 ] HuangTao edited comment on HDFS-15240 at 11/26/20, 11:14 AM: - Upload [^HDFS-15240.009.patch] to address {quote}Either we have to make sure all stale futures removed from service when clearing the futures. Otherwise in the current code we are assuming service#poll will give one of the submitted current future only. {quote} Thanks [~umamaheswararao] for that nice advice, this make code more clear. And thanks [~ferhui] for offline discussion. However, I'm a little worried about whether the below code slow down the reconstruction if the future can't finish quickly. {code:java} while (!futures.isEmpty()) { try { Future future = readService.poll( stripedReadTimeoutInMills, TimeUnit.MILLISECONDS); futures.remove(future); } catch (InterruptedException e) { } } {code} was (Author: marvelrock): Upload [^HDFS-15240.009.patch] to address {quote}Either we have to make sure all stale futures removed from service when clearing the futures. Otherwise in the current code we are assuming service#poll will give one of the submitted current future only. {quote} Thanks [~umamaheswararao] for that nice advice, this make code more clear. And thanks [~ferhui] for offline discussion. However, I'm a little worried about whether the below code slow down the reconstruction if the future can't finish. {code:java} while (!futures.isEmpty()) { try { Future future = readService.poll( stripedReadTimeoutInMills, TimeUnit.MILLISECONDS); futures.remove(future); } catch (InterruptedException e) { } } {code} > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. 
> {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017]
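To make the performance concern in the comment above concrete, here is a minimal, stand-alone sketch of draining a CompletionService so that no stale future is left behind. It is not taken from any of the attached patches; the names readService, futures and stripedReadTimeoutInMills simply mirror the snippet quoted in the comment, and the class itself is hypothetical. Cancelling the outstanding reads first means a blocked reader is interrupted and its (cancelled) future still lands on the completion queue, so poll() typically returns quickly instead of waiting out the full timeout.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class DrainStaleFutures {
  private final ExecutorService pool = Executors.newFixedThreadPool(4);
  // Mirrors the readService/futures pair from the snippet in the comment.
  private final CompletionService<Void> readService =
      new ExecutorCompletionService<>(pool);
  private final Map<Future<Void>, Integer> futures = new HashMap<>();
  private final long stripedReadTimeoutInMills = 5000L;

  void submit(Callable<Void> readTask, int chunkIndex) {
    futures.put(readService.submit(readTask), chunkIndex);
  }

  /** Remove every submitted future from the completion service. */
  void drain() {
    // Cancel outstanding reads first: a cancelled FutureTask is still placed
    // on the completion queue, so poll() usually returns it quickly instead
    // of waiting out the timeout on a reader stuck on a slow DataNode.
    for (Future<Void> f : futures.keySet()) {
      f.cancel(true);
    }
    while (!futures.isEmpty()) {
      try {
        Future<Void> done =
            readService.poll(stripedReadTimeoutInMills, TimeUnit.MILLISECONDS);
        if (done == null) {
          break; // timed out; give up rather than spin forever
        }
        futures.remove(done);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
  }
}
{code}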
[jira] [Comment Edited] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239090#comment-17239090 ] HuangTao edited comment on HDFS-15240 at 11/26/20, 6:50 AM: Upload [^HDFS-15240.009.patch] to address {quote}Either we have to make sure all stale futures removed from service when clearing the futures. Otherwise in the current code we are assuming service#poll will give one of the submitted current future only. {quote} Thanks [~umamaheswararao] for that nice advice, this make code more clear. And thanks [~ferhui] for offline discussion. However, I'm a little worried about whether the below code slow down the reconstruction if the future can't finish. {code:java} while (!futures.isEmpty()) { try { Future future = readService.poll( stripedReadTimeoutInMills, TimeUnit.MILLISECONDS); futures.remove(future); } catch (InterruptedException e) { } } {code} was (Author: marvelrock): Upload [^HDFS-15240.009.patch] to address ??Either we have to make sure all stale futures removed from service when clearing the futures. Otherwise in the current code we are assuming service#poll will give one of the submitted current future only.?? Thanks [~umamaheswararao] for that nice advice, this make code more clear. And thanks [~ferhui] for offline discussion. However, I'm a little worried about whether the below code slow down the reconstruction if the future can't finish. {code:java} while (!futures.isEmpty()) { try { Future future = readService.poll( stripedReadTimeoutInMills, TimeUnit.MILLISECONDS); futures.remove(future); } catch (InterruptedException e) { } } {code} > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. 
> {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted whil
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239090#comment-17239090 ] HuangTao commented on HDFS-15240: - Upload [^HDFS-15240.009.patch] to address ??Either we have to make sure all stale futures removed from service when clearing the futures. Otherwise in the current code we are assuming service#poll will give one of the submitted current future only.?? Thanks [~umamaheswararao] for that nice advice, this make code more clear. And thanks [~ferhui] for offline discussion. However, I'm a little worried about whether the below code slow down the reconstruction if the future can't finish. {code:java} while (!futures.isEmpty()) { try { Future future = readService.poll( stripedReadTimeoutInMills, TimeUnit.MILLISECONDS); futures.remove(future); } catch (InterruptedException e) { } } {code} > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. 
> {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. > [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconst
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.009.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > HDFS-15240.009.patch, image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239039#comment-17239039 ] HuangTao commented on HDFS-15240: - The NPE fails the reconstruction and closes the striped readers (the StripedBlockReconstructor thread). Meanwhile, the striped reader threads are still running asynchronously. {code:java} reconstructor.freeBuffer(reader.getReadBuffer()){code} The line above only returns the buffer to BUFFER_POOL; the striped reader still holds a reference to it. So the striped reader may store dirty data in that buffer before execution reaches the line below {code:java} reader.freeReadBuffer() {code} I have written a dedicated UT `testAbnormallyCloseDoesNotWriteBufferAgain` that demonstrates how the dirty buffer is generated when no fix is applied. > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. 
> {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. > [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockRe
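The race described in the comment above can be reproduced outside HDFS with a few lines of plain Java. The sketch below is a contrived, self-contained illustration (none of these classes are HDFS code): a buffer is returned to a pool while a reader thread still holds a reference to it, so the next consumer of the pool observes the reader's late write.

{code:java}
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class DirtyBufferRace {
  public static void main(String[] args) throws Exception {
    BlockingQueue<ByteBuffer> bufferPool = new ArrayBlockingQueue<>(1);
    ByteBuffer readBuffer = ByteBuffer.allocate(8);

    // Plays the role of a striped block reader whose read completes late.
    Thread reader = new Thread(() -> {
      try {
        TimeUnit.MILLISECONDS.sleep(100);
      } catch (InterruptedException ignored) {
        return;
      }
      readBuffer.put(0, (byte) 0x7f); // late write into the released buffer
    });
    reader.start();

    // The reconstructor frees the buffer *before* the reader has finished,
    // which is the unsafe ordering pointed out in the comment above.
    bufferPool.offer(readBuffer);

    // The next reconstruction task takes the supposedly clean buffer ...
    ByteBuffer reused = bufferPool.take();
    reader.join();
    // ... and finds the reader's data in it: the buffer is dirty.
    System.out.println("byte 0 after reuse: " + reused.get(0));
  }
}
{code}

Stopping the pending read (closing the block reader) before the buffer is returned to the pool removes the window in which such a late write can land, which is the ordering question the comments keep coming back to.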
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232608#comment-17232608 ] HuangTao commented on HDFS-15240: - Upload [^HDFS-15240.008.patch] for fixing checkstyle. > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that cont
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.008.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, HDFS-15240.008.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkRead
[jira] [Comment Edited] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232477#comment-17232477 ] HuangTao edited comment on HDFS-15240 at 11/16/20, 3:01 AM: Thanks [~ferhui] [~umamaheswararao] for your comments. I upload HDFS-15240.007.patch for addressing your comments. With regard to StripingChunkReadResult.OUTDATED, I think we should keep it, which can avoid scheduling an useless block reader. About reader.closeBlockReader(), {code:java} reconstructor.freeBuffer(reader.getReadBuffer()); // thread may stop here, before closeBlockReader, StripedBlockReader may write data into the released buffer. reader.freeReadBuffer(); reader.closeBlockReader();{code} I add UT `testAbnormallyCloseDoesNotWriteBufferAgain` which can detect the dirty buffer without any fix. In addition, I add {code:java} entry.getValue().clear();{code} in ElasticByteBufferPool#getBuffer to ensure the buffer is clear before return. was (Author: marvelrock): Thanks [~ferhui] [~umamaheswararao] for your comments. I upload HDFS-15240.007.patch for addressing your comments. With regard to StripingChunkReadResult.OUTDATED, I think we should keep it, which can avoid scheduling an useless block reader. About reader.closeBlockReader(), I add UT `testAbnormallyCloseDoesNotWriteBufferAgain`. {code:java} reconstructor.freeBuffer(reader.getReadBuffer()); // thread may stop here, before closeBlockReader, StripedBlockReader may write data into the released buffer. reader.freeReadBuffer(); reader.closeBlockReader();{code} > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. 
> {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout lef
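Regarding the ElasticByteBufferPool#getBuffer change mentioned in the comment above, the sketch below shows the general idea with a deliberately simplified pool; it is not the real ElasticByteBufferPool, whose internals differ. The pooled buffer is clear()ed before it is handed out, so every consumer starts at position 0 with the full capacity writable. Note that clear() only resets position and limit; it does not zero the old bytes.

{code:java}
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

/** Deliberately simplified buffer pool: at most one buffer per capacity. */
public class TinyBufferPool {
  private final TreeMap<Integer, ByteBuffer> buffers = new TreeMap<>();

  public synchronized ByteBuffer getBuffer(int length) {
    Map.Entry<Integer, ByteBuffer> entry = buffers.ceilingEntry(length);
    if (entry == null) {
      return ByteBuffer.allocate(length); // nothing pooled is large enough
    }
    buffers.remove(entry.getKey());
    // Reset position/limit before handing the buffer out, so a consumer never
    // starts reading or writing at whatever offset the last user left behind.
    entry.getValue().clear();
    return entry.getValue();
  }

  public synchronized void putBuffer(ByteBuffer buffer) {
    buffers.put(buffer.capacity(), buffer);
  }
}
{code}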
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232477#comment-17232477 ] HuangTao commented on HDFS-15240: - Thanks [~ferhui] [~umamaheswararao] for your comments. I upload HDFS-15240.007.patch for addressing your comments. With regard to StripingChunkReadResult.OUTDATED, I think we should keep it, which can avoid scheduling an useless block reader. About reader.closeBlockReader(), I add UT `testAbnormallyCloseDoesNotWriteBufferAgain`. {code:java} reconstructor.freeBuffer(reader.getReadBuffer()); // thread may stop here, before closeBlockReader, StripedBlockReader may write data into the released buffer. reader.freeReadBuffer(); reader.closeBlockReader();{code} > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. 
> {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. > [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Ex
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.007.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, HDFS-15240.007.patch, > image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225810#comment-17225810 ] HuangTao commented on HDFS-15240: - Thanks for your suggestion, [~ferhui] I will change that later. > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {co
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17224396#comment-17224396 ] HuangTao commented on HDFS-15240: - Thanks for your work, [~touchida]. I checked again and got the same result as you. > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, image-2020-07-16-15-56-38-608.png, > org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt, > org.apache.hadoop.hdfs.TestReconstructStripedFile.txt, > test-HDFS-15240.006.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futur
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.006.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > HDFS-15240.006.patch, image-2020-07-16-15-56-38-608.png > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > EDITED for readability. > {code:java} > decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 4 > Check the first 131072 bytes between block[6] and block[6'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4 > decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] > Check the first 131072 bytes between block[1] and block[1'], the longest > common substring length is 65536 > CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest > common substring length is 27197440 # this one > Check the first 131072 bytes between block[7] and block[7'], the longest > common substring length is 4 > Check the first 131072 bytes between block[8] and block[8'], the longest > common substring length is 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Description: When read some lzo files we found some blocks were broken. I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6'(decoded) and b6(read from DN)(b7'/b7 and b8'/b8). After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the length of ByteBuffer used by StripedBlockReader. So the corrupt reconstruction block is made by a dirty buffer. The following log snippet(only show 2 of 28 cases) is my check program output. In my case, I known the 3th block is corrupt, so need other 5 blocks to decode another 3 blocks, then find the 1th block's LCS substring is block length - 64kb. It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the dirty buffer was used before read the 1th block. Must be noted that StripedBlockReader read from the offset 0 of the 1th block after used the dirty buffer. EDITED for readability. {code:java} decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] Check the first 131072 bytes between block[1] and block[1'], the longest common substring length is 4 Check the first 131072 bytes between block[6] and block[6'], the longest common substring length is 4 Check the first 131072 bytes between block[8] and block[8'], the longest common substring length is 4 decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] Check the first 131072 bytes between block[1] and block[1'], the longest common substring length is 65536 CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest common substring length is 27197440 # this one Check the first 131072 bytes between block[7] and block[7'], the longest common substring length is 4 Check the first 131072 bytes between block[8] and block[8'], the longest common substring length is 4{code} Now I know the dirty buffer causes reconstruction block error, but how does the dirty buffer come about? After digging into the code and DN log, I found this following DN log is the root reason. {code:java} [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/:52586 remote=/:50010]. 18 millis timeout left. 
[WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped block: BP-714356632--1519726836856:blk_-YY_3472979393 java.lang.NullPointerException at org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) {code} Reading from DN may timeout(hold by a future(F)) and output the INFO log, but the futures that contains the future(F) is cleared, {code:java} return new StripingChunkReadResult(futures.remove(future), StripingChunkReadResult.CANCELLED); {code} futures.remove(future) cause NPE. So the EC reconstruction is failed. In the finally phase, the code snippet in *getStripedReader().close()* {code:java} reconstructor.freeBuffer(reader.getReadBuffer()); reader.freeReadBuffer(); reader.closeBlockReader(); {code} free buffer firstly, but the StripedBlockReader still holds the buffer and write it, that pollute the buffer of BufferPool. was: When read some lzo files we found some blocks were broken. I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6'(decoded) and b6(read from DN)(b7'/b7 and b8'/b8). After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the l
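The closing sequence quoted from getStripedReader().close() returns the read buffer to the pool before the block reader is shut down, so a read that is still in flight can write into memory the pool has already handed to the next reconstruction task. The toy program below reproduces that hazard with an invented buffer pool (it is deliberately not the DataNode code); the remedy the description points at is to make sure the reader can no longer touch the buffer, for example by closing it first, before the buffer goes back to the pool.

{code:java}
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

public class DirtyBufferSketch {

  // Minimal stand-in for a direct buffer pool: returned buffers are handed out
  // again as-is, without being cleared.
  static final ConcurrentLinkedQueue<ByteBuffer> pool = new ConcurrentLinkedQueue<>();

  static ByteBuffer borrow() {
    ByteBuffer b = pool.poll();
    return b != null ? b : ByteBuffer.allocate(8);
  }

  static void giveBack(ByteBuffer b) {
    pool.offer(b);
  }

  public static void main(String[] args) throws Exception {
    ByteBuffer readBuf = borrow();

    // A stalled striped read that only completes after the buffer was freed.
    Thread lateReader = new Thread(() -> {
      try { Thread.sleep(100); } catch (InterruptedException ignored) { }
      readBuf.clear();
      readBuf.put("STALE!!!".getBytes());
    });
    lateReader.start();

    // Wrong order, as in the quoted close(): free the buffer while the reader
    // may still write into it.
    giveBack(readBuf);

    lateReader.join();
    ByteBuffer reused = borrow();   // same instance, now polluted
    System.out.println("next borrower sees: " + new String(reused.array()));
  }
}
{code}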
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Description: When read some lzo files we found some blocks were broken. I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6'(decoded) and b6(read from DN)(b7'/b7 and b8'/b8). After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the length of ByteBuffer used by StripedBlockReader. So the corrupt reconstruction block is made by a dirty buffer. The following log snippet(only show 2 of 28 cases) is my check program output. In my case, I known the 3th block is corrupt, so need other 5 blocks to decode another 3 blocks, then find the 1th block's LCS substring is block length - 64kb. It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the dirty buffer was used before read the 1th block. Must be noted that StripedBlockReader read from the offset 0 of the 1th block after used the dirty buffer. EDITED for readability. {code:java} decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8'] Check the first 131072 bytes between block[1] and block[1'], the longest common substring length is 4 Check the first 131072 bytes between block[6] and block[6'], the longest common substring length is 4 Check the first 131072 bytes between block[8] and block[8'], the longest common substring length is 4 decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8'] Check the first 131072 bytes between block[1] and block[1'], the longest common substring length is 65536 CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest common substring length is 27197440 # this one Check the first 131072 bytes between block[7] and block[7'], the longest common substring length is 4 Check the first 131072 bytes between block[8] and block[8'], the longest common substring length is 4{code} Now I know the dirty buffer causes reconstruction block error, but how does the dirty buffer come about? After digging into the code and DN log, I found this following DN log is the root reason. {code:java} [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/:52586 remote=/:50010]. 18 millis timeout left. 
[WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped block: BP-714356632--1519726836856:blk_-YY_3472979393 java.lang.NullPointerException at org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) {code} Reading from DN may timeout(hold by a future(F)) and output the INFO log, but the futures that contains the future(F) is cleared, {code:java} return new StripingChunkReadResult(futures.remove(future), StripingChunkReadResult.CANCELLED); {code} futures.remove(future) cause NPE. So the EC reconstruction is failed. In the finally phase, the code snippet in *getStripedReader().close()* {code:java} reconstructor.freeBuffer(reader.getReadBuffer()); reader.freeReadBuffer(); reader.closeBlockReader(); {code} free buffer firstly, but the StripedBlockReader still holds the buffer and write it. was: When read some lzo files we found some blocks were broken. I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6'(decoded) and b6(read from DN)(b7'/b7 and b8'/b8). After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the length of ByteBuffer used by StripedBlo
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17179461#comment-17179461 ] HuangTao commented on HDFS-15240: - No problem, [~ferhui] I will work on that later > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > image-2020-07-16-15-56-38-608.png > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159804#comment-17159804 ] HuangTao commented on HDFS-15240: - Could you please help me review that ? [~weichiu] I think this is a very critical bug. > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > image-2020-07-16-15-56-38-608.png > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReade
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Target Version/s: 3.4.0 > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, > image-2020-07-16-15-56-38-608.png > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) -
[jira] [Commented] (HDFS-15140) Replace FoldedTreeSet in Datanode with SortedSet or TreeMap
[ https://issues.apache.org/jira/browse/HDFS-15140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083996#comment-17083996 ] HuangTao commented on HDFS-15140: - About the patch, I found two points that need to change: {code} ReplicaInfo orig = b.getOriginalReplica(); - builders.get(volStorageID).add(orig); + unsortedBlocks.get(volStorageID).add(b); # replace b with orig {code} {code} +unsortedBlocks.put( +v.getStorageID(), new ArrayList(131072)); # should use a constant variable {code} > Replace FoldedTreeSet in Datanode with SortedSet or TreeMap > --- > > Key: HDFS-15140 > URL: https://issues.apache.org/jira/browse/HDFS-15140 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.3.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: HDFS-15140.001.patch, HDFS-15140.002.patch > > > Based on the problems discussed in HDFS-15131, I would like to explore > replacing the FoldedTreeSet structure in the datanode with a builtin Java > equivalent - either SortedSet or TreeMap. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
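For concreteness, the sketch below renders the two review points with stand-in types; the Replica stub, field names, and constant name are invented for illustration, and the real change would be made against ReplicaInfo and the DataNode volume code rather than these stubs.

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortedBlockListSketch {

  // Point 2: give the magic initial capacity (131072) a name.
  private static final int INITIAL_PER_VOLUME_CAPACITY = 131072;

  // Stand-in for a replica under recovery that wraps an original replica.
  static class Replica {
    final long blockId;
    Replica(long blockId) { this.blockId = blockId; }
    Replica getOriginalReplica() { return new Replica(blockId); }
  }

  public static void main(String[] args) {
    Map<String, List<Replica>> unsortedBlocks = new HashMap<>();
    String volStorageID = "DS-demo-volume";
    unsortedBlocks.put(volStorageID,
        new ArrayList<>(INITIAL_PER_VOLUME_CAPACITY));

    Replica b = new Replica(42L);
    // Point 1: collect the original replica, not the recovering wrapper.
    Replica orig = b.getOriginalReplica();
    unsortedBlocks.get(volStorageID).add(orig);

    System.out.println("blocks for " + volStorageID + ": "
        + unsortedBlocks.get(volStorageID).size());
  }
}
{code}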
[jira] [Commented] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082972#comment-17082972 ] HuangTao commented on HDFS-15274: - Thanks, [~hexiaoqiao]. There are still some failed tests, I will check them again > NN doesn't remove the blocks from the failed DatanodeStorageInfo > > > Key: HDFS-15274 > URL: https://issues.apache.org/jira/browse/HDFS-15274 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.4.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15274.001.patch, HDFS-15274.002.patch > > > In our federation cluster, we found there were some inconsistency failure > volumes between two namespaces. The following logs are two NS separately. > NS1 received the failed storage info and removed the blocks associated with > the failed storage. > {code:java} > [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes > from 0 to 1 > [INFO] [IPC Server handler 76 on 8021] : > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs > failed. > [INFO] > [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] > : Removed blocks associated with storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010 > [INFO] [IPC Server handler 73 on 8021] : Removed storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010{code} > NS2 just received the failed storage. > {code:java} > [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes > from 0 to 1 {code} > > After digging into the code and trying to simulate disk failed with > {code:java} > echo offline > /sys/block/sda/device/state > echo 1 > /sys/block/sda/device/delete > # re-mount the failed disk > rescan-scsi-bus.sh -a > systemctl daemon-reload > mount /data0 > {code} > I found the root reason is the inconsistency between StorageReport and > VolumeFailureSummary in BPServiceActor#sendHeartBeat. > {code} > StorageReport[] reports = > dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); > .. > // the DISK may FAILED before executing the next line > VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() > .getVolumeFailureSummary(); > int numFailedVolumes = volumeFailureSummary != null ? > volumeFailureSummary.getFailedStorageLocations().length : 0; > {code} > I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082766#comment-17082766 ] HuangTao commented on HDFS-15274: - cc [~hexiaoqiao] could you help me trigger jenkins again? > NN doesn't remove the blocks from the failed DatanodeStorageInfo > > > Key: HDFS-15274 > URL: https://issues.apache.org/jira/browse/HDFS-15274 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.4.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15274.001.patch, HDFS-15274.002.patch > > > In our federation cluster, we found there were some inconsistency failure > volumes between two namespaces. The following logs are two NS separately. > NS1 received the failed storage info and removed the blocks associated with > the failed storage. > {code:java} > [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes > from 0 to 1 > [INFO] [IPC Server handler 76 on 8021] : > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs > failed. > [INFO] > [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] > : Removed blocks associated with storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010 > [INFO] [IPC Server handler 73 on 8021] : Removed storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010{code} > NS2 just received the failed storage. > {code:java} > [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes > from 0 to 1 {code} > > After digging into the code and trying to simulate disk failed with > {code:java} > echo offline > /sys/block/sda/device/state > echo 1 > /sys/block/sda/device/delete > # re-mount the failed disk > rescan-scsi-bus.sh -a > systemctl daemon-reload > mount /data0 > {code} > I found the root reason is the inconsistency between StorageReport and > VolumeFailureSummary in BPServiceActor#sendHeartBeat. > {code} > StorageReport[] reports = > dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); > .. > // the DISK may FAILED before executing the next line > VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() > .getVolumeFailureSummary(); > int numFailedVolumes = volumeFailureSummary != null ? > volumeFailureSummary.getFailedStorageLocations().length : 0; > {code} > I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082348#comment-17082348 ] HuangTao commented on HDFS-15274: - All failed test cases can pass locally > NN doesn't remove the blocks from the failed DatanodeStorageInfo > > > Key: HDFS-15274 > URL: https://issues.apache.org/jira/browse/HDFS-15274 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.4.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15274.001.patch, HDFS-15274.002.patch > > > In our federation cluster, we found there were some inconsistency failure > volumes between two namespaces. The following logs are two NS separately. > NS1 received the failed storage info and removed the blocks associated with > the failed storage. > {code:java} > [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes > from 0 to 1 > [INFO] [IPC Server handler 76 on 8021] : > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs > failed. > [INFO] > [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] > : Removed blocks associated with storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010 > [INFO] [IPC Server handler 73 on 8021] : Removed storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010{code} > NS2 just received the failed storage. > {code:java} > [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes > from 0 to 1 {code} > > After digging into the code and trying to simulate disk failed with > {code:java} > echo offline > /sys/block/sda/device/state > echo 1 > /sys/block/sda/device/delete > # re-mount the failed disk > rescan-scsi-bus.sh -a > systemctl daemon-reload > mount /data0 > {code} > I found the root reason is the inconsistency between StorageReport and > VolumeFailureSummary in BPServiceActor#sendHeartBeat. > {code} > StorageReport[] reports = > dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); > .. > // the DISK may FAILED before executing the next line > VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() > .getVolumeFailureSummary(); > int numFailedVolumes = volumeFailureSummary != null ? > volumeFailureSummary.getFailedStorageLocations().length : 0; > {code} > I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082347#comment-17082347 ] HuangTao commented on HDFS-15274: - retrigger QA > NN doesn't remove the blocks from the failed DatanodeStorageInfo > > > Key: HDFS-15274 > URL: https://issues.apache.org/jira/browse/HDFS-15274 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.4.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15274.001.patch, HDFS-15274.002.patch > > > In our federation cluster, we found there were some inconsistency failure > volumes between two namespaces. The following logs are two NS separately. > NS1 received the failed storage info and removed the blocks associated with > the failed storage. > {code:java} > [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes > from 0 to 1 > [INFO] [IPC Server handler 76 on 8021] : > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs > failed. > [INFO] > [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] > : Removed blocks associated with storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010 > [INFO] [IPC Server handler 73 on 8021] : Removed storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010{code} > NS2 just received the failed storage. > {code:java} > [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes > from 0 to 1 {code} > > After digging into the code and trying to simulate disk failed with > {code:java} > echo offline > /sys/block/sda/device/state > echo 1 > /sys/block/sda/device/delete > # re-mount the failed disk > rescan-scsi-bus.sh -a > systemctl daemon-reload > mount /data0 > {code} > I found the root reason is the inconsistency between StorageReport and > VolumeFailureSummary in BPServiceActor#sendHeartBeat. > {code} > StorageReport[] reports = > dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); > .. > // the DISK may FAILED before executing the next line > VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() > .getVolumeFailureSummary(); > int numFailedVolumes = volumeFailureSummary != null ? > volumeFailureSummary.getFailedStorageLocations().length : 0; > {code} > I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15274: Attachment: HDFS-15274.002.patch > NN doesn't remove the blocks from the failed DatanodeStorageInfo > > > Key: HDFS-15274 > URL: https://issues.apache.org/jira/browse/HDFS-15274 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.4.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15274.001.patch, HDFS-15274.002.patch > > > In our federation cluster, we found there were some inconsistency failure > volumes between two namespaces. The following logs are two NS separately. > NS1 received the failed storage info and removed the blocks associated with > the failed storage. > {code:java} > [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes > from 0 to 1 > [INFO] [IPC Server handler 76 on 8021] : > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs > failed. > [INFO] > [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] > : Removed blocks associated with storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010 > [INFO] [IPC Server handler 73 on 8021] : Removed storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010{code} > NS2 just received the failed storage. > {code:java} > [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes > from 0 to 1 {code} > > After digging into the code and trying to simulate disk failed with > {code:java} > echo offline > /sys/block/sda/device/state > echo 1 > /sys/block/sda/device/delete > # re-mount the failed disk > rescan-scsi-bus.sh -a > systemctl daemon-reload > mount /data0 > {code} > I found the root reason is the inconsistency between StorageReport and > VolumeFailureSummary in BPServiceActor#sendHeartBeat. > {code} > StorageReport[] reports = > dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); > .. > // the DISK may FAILED before executing the next line > VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() > .getVolumeFailureSummary(); > int numFailedVolumes = volumeFailureSummary != null ? > volumeFailureSummary.getFailedStorageLocations().length : 0; > {code} > I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082132#comment-17082132 ] HuangTao commented on HDFS-15274: - [~daryn] please help to take a look. > NN doesn't remove the blocks from the failed DatanodeStorageInfo > > > Key: HDFS-15274 > URL: https://issues.apache.org/jira/browse/HDFS-15274 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.4.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15274.001.patch > > > In our federation cluster, we found there were some inconsistency failure > volumes between two namespaces. The following logs are two NS separately. > NS1 received the failed storage info and removed the blocks associated with > the failed storage. > {code:java} > [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes > from 0 to 1 > [INFO] [IPC Server handler 76 on 8021] : > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs > failed. > [INFO] > [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] > : Removed blocks associated with storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010 > [INFO] [IPC Server handler 73 on 8021] : Removed storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010{code} > NS2 just received the failed storage. > {code:java} > [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes > from 0 to 1 {code} > > After digging into the code and trying to simulate disk failed with > {code:java} > echo offline > /sys/block/sda/device/state > echo 1 > /sys/block/sda/device/delete > # re-mount the failed disk > rescan-scsi-bus.sh -a > systemctl daemon-reload > mount /data0 > {code} > I found the root reason is the inconsistency between StorageReport and > VolumeFailureSummary in BPServiceActor#sendHeartBeat. > {code} > StorageReport[] reports = > dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); > .. > // the DISK may FAILED before executing the next line > VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() > .getVolumeFailureSummary(); > int numFailedVolumes = volumeFailureSummary != null ? > volumeFailureSummary.getFailedStorageLocations().length : 0; > {code} > I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15274: Attachment: HDFS-15274.001.patch Status: Patch Available (was: Open) > NN doesn't remove the blocks from the failed DatanodeStorageInfo > > > Key: HDFS-15274 > URL: https://issues.apache.org/jira/browse/HDFS-15274 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.4.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15274.001.patch > > > In our federation cluster, we found there were some inconsistency failure > volumes between two namespaces. The following logs are two NS separately. > NS1 received the failed storage info and removed the blocks associated with > the failed storage. > {code:java} > [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes > from 0 to 1 > [INFO] [IPC Server handler 76 on 8021] : > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs > failed. > [INFO] > [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] > : Removed blocks associated with storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010 > [INFO] [IPC Server handler 73 on 8021] : Removed storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010{code} > NS2 just received the failed storage. > {code:java} > [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes > from 0 to 1 {code} > > After digging into the code and trying to simulate disk failed with > {code:java} > echo offline > /sys/block/sda/device/state > echo 1 > /sys/block/sda/device/delete > # re-mount the failed disk > rescan-scsi-bus.sh -a > systemctl daemon-reload > mount /data0 > {code} > I found the root reason is the inconsistency between StorageReport and > VolumeFailureSummary in BPServiceActor#sendHeartBeat. > {code} > StorageReport[] reports = > dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); > .. > // the DISK may FAILED before executing the next line > VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() > .getVolumeFailureSummary(); > int numFailedVolumes = volumeFailureSummary != null ? > volumeFailureSummary.getFailedStorageLocations().length : 0; > {code} > I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15274: Fix Version/s: (was: 3.4.0) > NN doesn't remove the blocks from the failed DatanodeStorageInfo > > > Key: HDFS-15274 > URL: https://issues.apache.org/jira/browse/HDFS-15274 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > > In our federation cluster, we found there were some inconsistency failure > volumes between two namespaces. The following logs are two NS separately. > NS1 received the failed storage info and removed the blocks associated with > the failed storage. > {code:java} > [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes > from 0 to 1 > [INFO] [IPC Server handler 76 on 8021] : > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs > failed. > [INFO] > [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] > : Removed blocks associated with storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010 > [INFO] [IPC Server handler 73 on 8021] : Removed storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010{code} > NS2 just received the failed storage. > {code:java} > [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes > from 0 to 1 {code} > > After digging into the code and trying to simulate disk failed with > {code:java} > echo offline > /sys/block/sda/device/state > echo 1 > /sys/block/sda/device/delete > # re-mount the failed disk > rescan-scsi-bus.sh -a > systemctl daemon-reload > mount /data0 > {code} > I found the root reason is the inconsistency between StorageReport and > VolumeFailureSummary in BPServiceActor#sendHeartBeat. > {code} > StorageReport[] reports = > dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); > .. > // the DISK may FAILED before executing the next line > VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() > .getVolumeFailureSummary(); > int numFailedVolumes = volumeFailureSummary != null ? > volumeFailureSummary.getFailedStorageLocations().length : 0; > {code} > I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
[ https://issues.apache.org/jira/browse/HDFS-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15274: Affects Version/s: 3.4.0 > NN doesn't remove the blocks from the failed DatanodeStorageInfo > > > Key: HDFS-15274 > URL: https://issues.apache.org/jira/browse/HDFS-15274 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.4.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > > In our federation cluster, we found there were some inconsistency failure > volumes between two namespaces. The following logs are two NS separately. > NS1 received the failed storage info and removed the blocks associated with > the failed storage. > {code:java} > [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes > from 0 to 1 > [INFO] [IPC Server handler 76 on 8021] : > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs > failed. > [INFO] > [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] > : Removed blocks associated with storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010 > [INFO] [IPC Server handler 73 on 8021] : Removed storage > [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs > from DataNode X.X.X.X:50010{code} > NS2 just received the failed storage. > {code:java} > [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes > from 0 to 1 {code} > > After digging into the code and trying to simulate disk failed with > {code:java} > echo offline > /sys/block/sda/device/state > echo 1 > /sys/block/sda/device/delete > # re-mount the failed disk > rescan-scsi-bus.sh -a > systemctl daemon-reload > mount /data0 > {code} > I found the root reason is the inconsistency between StorageReport and > VolumeFailureSummary in BPServiceActor#sendHeartBeat. > {code} > StorageReport[] reports = > dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); > .. > // the DISK may FAILED before executing the next line > VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() > .getVolumeFailureSummary(); > int numFailedVolumes = volumeFailureSummary != null ? > volumeFailureSummary.getFailedStorageLocations().length : 0; > {code} > I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve > this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo
HuangTao created HDFS-15274: --- Summary: NN doesn't remove the blocks from the failed DatanodeStorageInfo Key: HDFS-15274 URL: https://issues.apache.org/jira/browse/HDFS-15274 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: HuangTao Assignee: HuangTao Fix For: 3.4.0 In our federation cluster, we found there were some inconsistency failure volumes between two namespaces. The following logs are two NS separately. NS1 received the failed storage info and removed the blocks associated with the failed storage. {code:java} [INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes from 0 to 1 [INFO] [IPC Server handler 76 on 8021] : [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs failed. [INFO] [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] : Removed blocks associated with storage [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs from DataNode X.X.X.X:50010 [INFO] [IPC Server handler 73 on 8021] : Removed storage [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs from DataNode X.X.X.X:50010{code} NS2 just received the failed storage. {code:java} [INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes from 0 to 1 {code} After digging into the code and trying to simulate disk failed with {code:java} echo offline > /sys/block/sda/device/state echo 1 > /sys/block/sda/device/delete # re-mount the failed disk rescan-scsi-bus.sh -a systemctl daemon-reload mount /data0 {code} I found the root reason is the inconsistency between StorageReport and VolumeFailureSummary in BPServiceActor#sendHeartBeat. {code} StorageReport[] reports = dn.getFSDataset().getStorageReports(bpos.getBlockPoolId()); .. // the DISK may FAILED before executing the next line VolumeFailureSummary volumeFailureSummary = dn.getFSDataset() .getVolumeFailureSummary(); int numFailedVolumes = volumeFailureSummary != null ? volumeFailureSummary.getFailedStorageLocations().length : 0; {code} I improved the tolerance in NN DatanodeDescriptor#updateStorageStats to solve this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
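The description pins the inconsistency on the gap between building the StorageReport array and reading the VolumeFailureSummary inside BPServiceActor#sendHeartBeat: a volume that fails in that gap is counted in the summary but still looks NORMAL in the reports, so the NameNode only bumps its failed-storage counter and never removes the affected blocks. The toy program below, with invented names and data, shows the race and one naive DN-side mitigation (re-sampling the reports when the failure count changed in between); the attached patch reportedly tackles it on the NameNode side in DatanodeDescriptor#updateStorageStats instead, so treat this only as an illustration of the ordering problem.

{code:java}
import java.util.concurrent.atomic.AtomicInteger;

public class HeartbeatSnapshotSketch {

  // Stand-in for the dataset's failed-volume state.
  static final AtomicInteger failedVolumes = new AtomicInteger(0);

  static String[] getStorageReports() {
    // Reports reflect the state at the moment they are built.
    return failedVolumes.get() == 0
        ? new String[] {"DS-0:NORMAL", "DS-1:NORMAL"}
        : new String[] {"DS-0:FAILED", "DS-1:NORMAL"};
  }

  static int getVolumeFailureCount() {
    return failedVolumes.get();
  }

  public static void main(String[] args) {
    int before = getVolumeFailureCount();
    String[] reports = getStorageReports();

    failedVolumes.set(1);   // the disk dies right here, as in the bug report

    int after = getVolumeFailureCount();
    if (after != before) {
      // Re-sample so the heartbeat carries a consistent pair of views.
      reports = getStorageReports();
    }
    System.out.println("failedVolumes=" + after
        + " reports=" + String.join(",", reports));
  }
}
{code}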
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Fix Version/s: 3.4.0 > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Affects Versions: 3.3.1 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) -
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Affects Version/s: (was: 3.3.1) > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) ---
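The second half of the report -- the buffer being freed in getStripedReader().close() before the StripedBlockReader has stopped writing into it -- is essentially an ownership/ordering problem. Below is a hedged sketch (stand-in interfaces, not the actual HDFS patch) of an ordering that avoids recycling a buffer while a reader thread may still write into it: close the block reader first, then return its buffer.
{code:java}
// Hedged sketch only -- the interfaces are stand-ins for the real classes.
import java.nio.ByteBuffer;

interface StripedReaderLike {
  ByteBuffer getReadBuffer();
  void freeReadBuffer();
  void closeBlockReader();
}

interface ReconstructorLike {
  void freeBuffer(ByteBuffer buffer);     // returns the buffer to a shared pool
}

class CloseOrderingSketch {
  // The quoted close() frees the buffer first and closes the reader last, so an
  // in-flight read can still write into a buffer that was already recycled and
  // will be handed, "dirty", to the next reconstruction task. Stopping the
  // producer before recycling removes that window.
  void close(StripedReaderLike reader, ReconstructorLike reconstructor) {
    reader.closeBlockReader();                         // 1. nothing writes anymore
    reconstructor.freeBuffer(reader.getReadBuffer());  // 2. now safe to recycle
    reader.freeReadBuffer();
  }
}
{code}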
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Affects Version/s: (was: 3.3.0) 3.3.1 > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Affects Versions: 3.3.1 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077745#comment-17077745 ] HuangTao edited comment on HDFS-15240 at 4/8/20, 2:10 AM: -- Hi watchers, anyone would like to help take a review? cc [~surendrasingh] please help to take a look. was (Author: marvelrock): Hi watchers, anyone would like to help take a review? > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > rea
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077745#comment-17077745 ] HuangTao commented on HDFS-15240: - Hi watchers, anyone would like to help take a review? > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by A
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Affects Version/s: 3.3.0 > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17075854#comment-17075854 ] HuangTao commented on HDFS-15240: - the failed test is unrelated with this patch. [~weichiu] PTAL, Thanks > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.005.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.004.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: h
[jira] [Resolved] (HDFS-15260) Change from List to Set to improve the chooseTarget performance
[ https://issues.apache.org/jira/browse/HDFS-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao resolved HDFS-15260. - Resolution: Invalid This change applies only to our custom internal version. > Change from List to Set to improve the chooseTarget performance > --- > > Key: HDFS-15260 > URL: https://issues.apache.org/jira/browse/HDFS-15260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15260) Change from List to Set to improve the chooseTarget performance
[ https://issues.apache.org/jira/browse/HDFS-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15260: Attachment: (was: choose_target_flamegraph.png) > Change from List to Set to improve the chooseTarget performance > --- > > Key: HDFS-15260 > URL: https://issues.apache.org/jira/browse/HDFS-15260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15260) Change from List to Set to improve the chooseTarget performance
[ https://issues.apache.org/jira/browse/HDFS-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15260: Attachment: choose_target_flamegraph.png > Change from List to Set to improve the chooseTarget performance > --- > > Key: HDFS-15260 > URL: https://issues.apache.org/jira/browse/HDFS-15260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Attachments: choose_target_flamegraph.png > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15260) Change from List to Set to improve the chooseTarget performance
HuangTao created HDFS-15260: --- Summary: Change from List to Set to improve the chooseTarget performance Key: HDFS-15260 URL: https://issues.apache.org/jira/browse/HDFS-15260 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: HuangTao Assignee: HuangTao -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
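Since the chooseTarget code itself is not quoted in this thread, the following is only an illustration of the List-vs-Set idea in the summary above: membership checks against an excluded-node collection are O(n) on an ArrayList but roughly O(1) on a HashSet, which matters when the check runs once per candidate node.
{code:java}
// Illustration only; node names and sizes are made up for the demo.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ExcludedNodesDemo {
  public static void main(String[] args) {
    int n = 100_000;
    List<String> excludedList = new ArrayList<>();
    Set<String> excludedSet = new HashSet<>();
    for (int i = 0; i < n; i++) {
      String node = "dn-" + i;
      excludedList.add(node);
      excludedSet.add(node);
    }

    String candidate = "dn-" + (n - 1);
    long t0 = System.nanoTime();
    boolean inList = excludedList.contains(candidate);  // linear scan
    long t1 = System.nanoTime();
    boolean inSet = excludedSet.contains(candidate);    // hash lookup
    long t2 = System.nanoTime();

    System.out.printf("list: %d ns (%b), set: %d ns (%b)%n",
        t1 - t0, inList, t2 - t1, inSet);
  }
}
{code}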
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.003.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17075021#comment-17075021 ] HuangTao commented on HDFS-15240: - fix checkstyle and the unit case > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) ---
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.002.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For ad
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073396#comment-17073396 ] HuangTao commented on HDFS-15240: - [~weichiu] I'm working on that, I will upload another patch later > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsu
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067871#comment-17067871 ] HuangTao commented on HDFS-15240: - I will try to add an UT > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from > DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') > blocks. And find the longest common sequenece(LCS) between b6'(decoded) and > b6(read from DN)(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet(only show 2 of 28 cases) is my check program > output. In my case, I known the 3th block is corrupt, so need other 5 blocks > to decode another 3 blocks, then find the 1th block's LCS substring is block > length - 64kb. > It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the > dirty buffer was used before read the 1th block. > Must be noted that StripedBlockReader read from the offset 0 of the 1th block > after used the dirty buffer. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@h
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Description: When read some lzo files we found some blocks were broken. I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6'(decoded) and b6(read from DN)(b7'/b7 and b8'/b8). After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the length of ByteBuffer used by StripedBlockReader. So the corrupt reconstruction block is made by a dirty buffer. The following log snippet(only show 2 of 28 cases) is my check program output. In my case, I known the 3th block is corrupt, so need other 5 blocks to decode another 3 blocks, then find the 1th block's LCS substring is block length - 64kb. It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the dirty buffer was used before read the 1th block. Must be noted that StripedBlockReader read from the offset 0 of the 1th block after used the dirty buffer. {code:java} decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] Check Block(1) first 131072 bytes longest common substring length 4 Check Block(6) first 131072 bytes longest common substring length 4 Check Block(8) first 131072 bytes longest common substring length 4 decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] Check Block(1) first 131072 bytes longest common substring length 65536 CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length 27197440 # this one Check Block(7) first 131072 bytes longest common substring length 4 Check Block(8) first 131072 bytes longest common substring length 4{code} Now I know the dirty buffer causes reconstruction block error, but how does the dirty buffer come about? After digging into the code and DN log, I found this following DN log is the root reason. {code:java} [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/:52586 remote=/:50010]. 18 millis timeout left. 
[WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped block: BP-714356632--1519726836856:blk_-YY_3472979393 java.lang.NullPointerException at org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) {code} Reading from DN may timeout(hold by a future(F)) and output the INFO log, but the futures that contains the future(F) is cleared, {code:java} return new StripingChunkReadResult(futures.remove(future), StripingChunkReadResult.CANCELLED); {code} futures.remove(future) cause NPE. So the EC reconstruction is failed. In the finally phase, the code snippet in *getStripedReader().close()* {code:java} reconstructor.freeBuffer(reader.getReadBuffer()); reader.freeReadBuffer(); reader.closeBlockReader(); {code} free buffer firstly, but the StripedBlockReader still holds the buffer and write it. was: When read some lzo files we found some blocks were broken. I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6'(decoded) and b6(read from DN)(b7'/b7 and b8'/b8). After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the length of ByteBuffer used by StripedBlockReader. So the corrupt reconstruction block is made by a dirty buffer. The following log snippet(only show 2 of 28 cases) is my check program output. In my case, I known the 3th block is corrupt, so need other 5 blocks to decode another 3 blocks, then find the 1th block's LCS substring is block length
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Description: When read some lzo files we found some blocks were broken. I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6'(decoded) and b6(read from DN)(b7'/b7 and b8'/b8). After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the length of ByteBuffer used by StripedBlockReader. So the corrupt reconstruction block is made by a dirty buffer. The following log snippet(only show 2 of 28 cases) is my check program output. In my case, I known the 3th block is corrupt, so need other 5 blocks to decode another 3 blocks, then find the 1th block's LCS substring is block length - 64kb. It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the dirty buffer was used before read the 1th block. {code:java} decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] Check Block(1) first 131072 bytes longest common substring length 4 Check Block(6) first 131072 bytes longest common substring length 4 Check Block(8) first 131072 bytes longest common substring length 4 decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] Check Block(1) first 131072 bytes longest common substring length 65536 CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length 27197440 # this one Check Block(7) first 131072 bytes longest common substring length 4 Check Block(8) first 131072 bytes longest common substring length 4{code} Now I know the dirty buffer causes reconstruction block error, but how does the dirty buffer come about? After digging into the code and DN log, I found this following DN log is the root reason. {code:java} [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/:52586 remote=/:50010]. 18 millis timeout left. [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped block: BP-714356632--1519726836856:blk_-YY_3472979393 java.lang.NullPointerException at org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) {code} Reading from DN may timeout(hold by a future(F)) and output the INFO log, but the futures that contains the future(F) is cleared, {code:java} return new StripingChunkReadResult(futures.remove(future), StripingChunkReadResult.CANCELLED); {code} futures.remove(future) cause NPE. 
So the EC reconstruction is failed. In the finally phase, the code snippet in *getStripedReader().close()* {code:java} reconstructor.freeBuffer(reader.getReadBuffer()); reader.freeReadBuffer(); reader.closeBlockReader(); {code} free buffer firstly, but the StripedBlockReader still holds the buffer and write it. was: When read some lzo files we found some blocks were broken. I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k), and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6' and b6(b7'/b7 and b8'/b8). After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the length of ByteBuffer used by StripedBlockReader. So the corrupt reconstruction block is made by a dirty buffer. The following log snippet is my check program output {code:java} decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] Check Block(1) first 131072 bytes longest common substring length 4 Check Block(6) first 131072 bytes longest common substring length 4 Check Block(8) first 131072 bytes longest common substring length 4 decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] Check Block(1) first 131072
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067539#comment-17067539 ] HuangTao commented on HDFS-15240: - [~weichiu] PTAL, Thanks > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k), and > choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the > longest common sequenece(LCS) between b6' and b6(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet is my check program output > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
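The "dirty buffer" part of the report is a use-after-free style race: the finally block returns the 64KB read buffer to the pool while a timed-out StripedBlockReader task may still be writing into it. The sketch below illustrates the safer ordering, cancel and wait for the pending read before recycling the buffer; it is only an illustration of the race under simplified assumptions, not the fix that was actually committed for this issue.

{code:java}
import java.nio.ByteBuffer;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only (not the actual HDFS patch): if the ByteBuffer is recycled while
// the read task may still be running, the task can scribble into memory that has
// already been handed to the next reconstruction -- the "dirty buffer". Cancelling
// and waiting for the task before releasing the buffer avoids the race.
public class BufferReleaseOrder {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    ByteBuffer readBuffer = ByteBuffer.allocate(64 * 1024);   // 64KB, as in StripedBlockReader

    Future<?> pendingRead = pool.submit(() -> {
      readBuffer.put(new byte[1024]);     // simulated in-flight read into the buffer
    });

    // Wrong order (what the report describes): free the buffer first, close the reader later.
    // Safer order: make sure no reader can touch the buffer any more, then free it.
    pendingRead.cancel(true);
    try {
      pendingRead.get();                  // wait until the task has really stopped
    } catch (Exception ignored) {         // cancellation or read failure is expected here
    }
    readBuffer.clear();                   // now it is safe to recycle the buffer
    pool.shutdown();
  }
}
{code}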
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Attachment: HDFS-15240.001.patch > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Attachments: HDFS-15240.001.patch > > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k), and > choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the > longest common sequenece(LCS) between b6' and b6(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet is my check program output > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and DN log, I found this following DN log is the > root reason. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left. 
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from DN may timeout(hold by a future(F)) and output the INFO log, but > the futures that contains the future(F) is cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > futures.remove(future) cause NPE. So the EC reconstruction is failed. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > free buffer firstly, but the StripedBlockReader still holds the buffer and > write it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
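For reference, the "longest common substring" numbers quoted in the description can be produced with a standard dynamic program over the two byte arrays (the decoded block vs. the block read back from the DataNode). The reporter's check program is not attached to the issue, so the sketch below is only an assumption of how such a comparison could look, and its O(n*m) cost makes it practical only for small buffers, not for the 27MB blocks in the report.

{code:java}
// Illustrative sketch of the kind of check described above: the length of the longest
// common substring between a decoded block and the block read back from the DataNode.
public class LongestCommonSubstring {
  static int longestCommonSubstring(byte[] a, byte[] b) {
    int best = 0;
    int[] prev = new int[b.length + 1];
    for (int i = 1; i <= a.length; i++) {
      int[] cur = new int[b.length + 1];
      for (int j = 1; j <= b.length; j++) {
        if (a[i - 1] == b[j - 1]) {
          cur[j] = prev[j - 1] + 1;      // extend the match ending at (i-1, j-1)
          best = Math.max(best, cur[j]);
        }
      }
      prev = cur;
    }
    return best;
  }

  public static void main(String[] args) {
    byte[] decoded = {1, 2, 3, 4, 5, 6};
    byte[] readBack = {9, 3, 4, 5, 9, 9};
    System.out.println(longestCommonSubstring(decoded, readBack));   // prints 3
  }
}
{code}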
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Description: When read some lzo files we found some blocks were broken. I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k), and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6' and b6(b7'/b7 and b8'/b8). After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the length of ByteBuffer used by StripedBlockReader. So the corrupt reconstruction block is made by a dirty buffer. The following log snippet is my check program output {code:java} decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] Check Block(1) first 131072 bytes longest common substring length 4 Check Block(6) first 131072 bytes longest common substring length 4 Check Block(8) first 131072 bytes longest common substring length 4 decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] Check Block(1) first 131072 bytes longest common substring length 65536 CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length 27197440 # this one Check Block(7) first 131072 bytes longest common substring length 4 Check Block(8) first 131072 bytes longest common substring length 4{code} Now I know the dirty buffer causes reconstruction block error, but how does the dirty buffer come about? After digging into the code and DN log, I found this following DN log is the root reason. {code:java} [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/:52586 remote=/:50010]. 18 millis timeout left. [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped block: BP-714356632--1519726836856:blk_-YY_3472979393 java.lang.NullPointerException at org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) {code} Reading from DN may timeout(hold by a future(F)) and output the INFO log, but the futures that contains the future(F) is cleared, {code:java} return new StripingChunkReadResult(futures.remove(future), StripingChunkReadResult.CANCELLED); {code} futures.remove(future) cause NPE. So the EC reconstruction is failed. In the finally phase, the code snippet in *getStripedReader().close()* {code:java} reconstructor.freeBuffer(reader.getReadBuffer()); reader.freeReadBuffer(); reader.closeBlockReader(); {code} free buffer firstly, but the StripedBlockReader still holds the buffer and write it. 
was: When read some lzo files we found some blocks were broken. I read back all internal blocks of the block group(RS-6-3-1024k), and choose 6 blocks to decode other 3 block > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > > When read some lzo files we found some blocks were broken. > I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k), and > choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the > longest common sequenece(LCS) between b6' and b6(b7'/b7 and b8'/b8). > After selecting 6 blocks of the block group in combinations one time and > iterating through all cases, I find one case that the length of LCS is the > block length - 64KB, 64KB is just the length of ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block is made by a dirty > buffer. > The following log snippet is my check program output > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6,
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Description: When reading some lzo files we found some blocks were broken. I read back all internal blocks of the block group (RS-6-3-1024k), and chose 6 blocks to decode the other 3 blocks > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > > When reading some lzo files we found some blocks were broken. > I read back all internal blocks of the block group (RS-6-3-1024k), and chose > 6 blocks to decode the other 3 blocks -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Summary: Erasure Coding: dirty buffer causes reconstruction block error (was: dirty buffer causes EC reconstruction block error) > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao reassigned HDFS-15240: --- Assignee: HuangTao > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15240) dirty buffer causes EC reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15240: Summary: dirty buffer causes EC reconstruction block error (was: dirty) > dirty buffer causes EC reconstruction block error > - > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15240) dirty
HuangTao created HDFS-15240: --- Summary: dirty Key: HDFS-15240 URL: https://issues.apache.org/jira/browse/HDFS-15240 Project: Hadoop HDFS Issue Type: Bug Components: datanode, erasure-coding Reporter: HuangTao -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case
[ https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040767#comment-17040767 ] HuangTao commented on HDFS-15186: - In our production, we meet same issue, and I have discussed with [~yaoguangdong] about this offline. [~weichiu] PTAL > Erasure Coding: Decommission may generate the parity block's content with all > 0 in some case > > > Key: HDFS-15186 > URL: https://issues.apache.org/jira/browse/HDFS-15186 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Affects Versions: 3.0.3, 3.2.1, 3.1.3 >Reporter: Yao Guangdong >Priority: Critical > Fix For: 3.3.0 > > > I can find some parity block's content with all 0 when i decommission some > DataNode(more than 1) from a cluster. And the probability is very big(parts > per thousand).This is a big problem.You can think that if we read data from > the zero parity block or use the zero parity block to recover a block which > can make us use the error data even we don't know it. > There is some case in the below: > B: Busy DataNode, > D:Decommissioning DataNode, > Others is normal. > 1.Group indices is [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)]. > 2.Group indices is [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)]. > > In the first case when the block group indices is [0, 1, 2, 3, 4, 5, 6(B,D), > 7, 8(D)], the DN may received reconstruct block command and the > liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and the targets's(the field which in > the class StripedReconstructionInfo) length is 2. > The targets's length is 2 which mean that the DataNode need recover 2 > internal block in current code.But from the liveIndices we only can find 1 > missing block, so the method StripedWriter#initTargetIndices will use 0 as > the default recover block and don't care the indices 0 is in the sources > indices or not. > When they use sources indices [0, 1, 2, 3, 4, 5] to recover indices [6, 0] > use the ec algorithm.We can find that the indices [0] is in the both the > sources indices and the targets indices in this case. The returned target > buffer in the indices [6] is always 0 from the ec algorithm.So I think this > is the ec algorithm's problem. Because it should more fault tolerance.I try > to fixed it .But it is too hard. Because the case is too more. The second is > another case in the example above(use sources indices [1, 2, 3, 4, 5, 7] to > recover indices [0, 6, 0]). So I changed my mind.Invoke the ec algorithm > with a correct parameters. Which mean that remove the duplicate target > indices 0 in this case.Finally, I fixed it in this way. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
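The fix idea described in HDFS-15186, invoking the erasure decoder with de-duplicated target indices so that a defaulted index 0 is not decoded alongside itself, can be illustrated with a small order-preserving de-duplication step. The sketch below is illustrative only; the method and class names are assumptions, not the committed change.

{code:java}
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative sketch: before handing the target indices to the erasure decoder,
// drop duplicates (e.g. a default index 0 that is already a source) so the decoder
// is never asked to produce the same internal block twice.
public class TargetIndexDedup {
  static short[] dedup(short[] targetIndices) {
    Set<Short> seen = new LinkedHashSet<>();
    for (short idx : targetIndices) {
      seen.add(idx);                       // keeps the first occurrence, preserves order
    }
    short[] result = new short[seen.size()];
    int i = 0;
    for (short idx : seen) {
      result[i++] = idx;
    }
    return result;
  }

  public static void main(String[] args) {
    // From the example in the issue: recovering [0, 6, 0] should only decode [0, 6].
    short[] deduped = dedup(new short[]{0, 6, 0});
    System.out.println(java.util.Arrays.toString(deduped));   // [0, 6]
  }
}
{code}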
[jira] [Commented] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984093#comment-16984093 ] HuangTao commented on HDFS-15013: - Upload v002 patch. fix non-ha mode > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, HDFS-15013.002.patch, > image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984072#comment-16984072 ] HuangTao edited comment on HDFS-15013 at 11/28/19 2:32 AM: --- [~surendrasingh] sorry, I didn't think of non-HA mode, I will fix it right away. [~ayushtkn] I just moved render() into setInterval(), which will check http request to /conf per 5 ms. If the request finishes, will call render() was (Author: marvelrock): [~surendrasingh] sorry, I didn't think of non-HA mode, I will fix it right away. [~ayushtkn] I just moved render() into setInterval() > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, HDFS-15013.002.patch, > image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15013: Attachment: HDFS-15013.002.patch > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, HDFS-15013.002.patch, > image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15013: Fix Version/s: 3.3.0 > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, image-2019-11-26-10-05-39-640.png, > image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984072#comment-16984072 ] HuangTao edited comment on HDFS-15013 at 11/28/19 1:41 AM: --- [~surendrasingh] sorry, I didn't think of non-HA mode, I will fix it right away. [~ayushtkn] I just moved render() into setInterval() was (Author: marvelrock): [~surendrasingh] [~ayushtkn] sorry, I didn't think of non-HA mode, I will fix it right away > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Attachments: HDFS-15013.001.patch, image-2019-11-26-10-05-39-640.png, > image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984072#comment-16984072 ] HuangTao commented on HDFS-15013: - [~surendrasingh] [~ayushtkn] sorry, I didn't think of non-HA mode, I will fix it right away > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Attachments: HDFS-15013.001.patch, image-2019-11-26-10-05-39-640.png, > image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982235#comment-16982235 ] HuangTao commented on HDFS-15013: - [~inigoiri] I have checked federationhealth.js; there is no separate sync ajax request like the one in dfshealth.js. {code:java} $.ajax({'url': '/conf', 'dataType': 'xml', 'async': false}).done( function(d) { var $xml = $(d); var namespace, nnId; $xml.find('property').each(function(idx,v) { if ($(v).find('name').text() === 'dfs.nameservice.id') { namespace = $(v).find('value').text(); } if ($(v).find('name').text() === 'dfs.ha.namenode.id') { nnId = $(v).find('value').text(); } }); if (namespace && nnId) { data['HAInfo'] = {"Namespace": namespace, "NamenodeID": nnId}; } });{code} > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, image-2019-11-26-10-05-39-640.png, > image-2019-11-26-10-09-07-952.png > > > Now, the overview tab loads /conf synchronously, as shown in the following picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982089#comment-16982089 ] HuangTao commented on HDFS-15013: - [~surendrasingh] PTAL > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, image-2019-11-26-10-05-39-640.png, > image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15013: Attachment: HDFS-15013.001.patch Status: Patch Available (was: Open) > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-15013.001.patch, image-2019-11-26-10-05-39-640.png, > image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15013: Description: Now, the overview tab load /conf synchronously as follow picture. !image-2019-11-26-10-05-39-640.png! This issue will change it to an asynchronous method. The effect diagram is as follows. !image-2019-11-26-10-09-07-952.png! was: Now, the overview tab load /conf synchronously as follow picture. !image-2019-11-26-10-05-39-640.png|thumbnail! This issue will change it to an asynchronous method. The effect diagram is as follows. !image-2019-11-26-10-09-07-952.png|thumbnail! > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: image-2019-11-26-10-05-39-640.png, > image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15013) Reduce NameNode overview tab response time
[ https://issues.apache.org/jira/browse/HDFS-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-15013: Description: Now, the overview tab load /conf synchronously as follow picture. !image-2019-11-26-10-05-39-640.png|thumbnail! This issue will change it to an asynchronous method. The effect diagram is as follows. !image-2019-11-26-10-09-07-952.png|thumbnail! was: Now, the overview tab load /conf synchronously as follow picture. !image-2019-11-26-10-05-39-640.png|thumbnail! This issue will change it to an asynchronous method. The effect diagram is as follows. !image-2019-11-26-10-09-07-952.png|thumbnail! > Reduce NameNode overview tab response time > -- > > Key: HDFS-15013 > URL: https://issues.apache.org/jira/browse/HDFS-15013 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: HuangTao >Assignee: HuangTao >Priority: Minor > Fix For: 3.3.0 > > Attachments: image-2019-11-26-10-05-39-640.png, > image-2019-11-26-10-09-07-952.png > > > Now, the overview tab load /conf synchronously as follow picture. > !image-2019-11-26-10-05-39-640.png|thumbnail! > This issue will change it to an asynchronous method. The effect diagram is as > follows. > !image-2019-11-26-10-09-07-952.png|thumbnail! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15013) Reduce NameNode overview tab response time
HuangTao created HDFS-15013: --- Summary: Reduce NameNode overview tab response time Key: HDFS-15013 URL: https://issues.apache.org/jira/browse/HDFS-15013 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: HuangTao Assignee: HuangTao Fix For: 3.3.0 Attachments: image-2019-11-26-10-05-39-640.png, image-2019-11-26-10-09-07-952.png Now, the overview tab loads /conf synchronously, as shown in the following picture. !image-2019-11-26-10-05-39-640.png|thumbnail! This issue will change it to an asynchronous method. The effect diagram is as follows. !image-2019-11-26-10-09-07-952.png|thumbnail! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14849) Erasure Coding: the internal block is replicated many times when datanode is decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Description: When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC internal block in that datanode will be replicated many times. // added 2019/09/19 I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes simultaneously. !scheduleReconstruction.png! !fsck-file.png! was: When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in that datanode will be replicated infinitely. // added 2019/09/19 I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes simultaneously. !scheduleReconstruction.png! !fsck-file.png! > Erasure Coding: the internal block is replicated many times when datanode is > decommissioning > > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, liveBlockIndices.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC internal > block in that datanode will be replicated many times. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14849) Erasure Coding: the internal block is replicated many times when datanode is decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Summary: Erasure Coding: the internal block is replicated many times when datanode is decommissioning (was: Erasure Coding: replicate block infinitely when datanode being decommissioning) > Erasure Coding: the internal block is replicated many times when datanode is > decommissioning > > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, liveBlockIndices.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933911#comment-16933911 ] HuangTao commented on HDFS-14849: - Yes [~ferhui], just more than one, sometimes the number of an internal block equal the amount of racks. > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, liveBlockIndices.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933455#comment-16933455 ] HuangTao commented on HDFS-14849: - [~ferhui] I am trying to find the root cause, will sync here. > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, liveBlockIndices.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933376#comment-16933376 ] HuangTao edited comment on HDFS-14849 at 9/19/19 1:35 PM: -- I found a clue: `chooseSourceDatanodes` returns {quote}LIVE=2, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=22{quote} All block indices (0-8) exist, three internal blocks (3/4/8) have no redundant copy, the datanode storing block 8 is in DECOMMISSIONING, and the adminState of the other two datanodes is null. {quote}[0, 1, 2, 3, 4, 5, 6, 7, 8, 6, 7, 6, 6, 5, 0, 1, 5, 0, 2, 5, 2, 5, 1, 2, 1, 5, 2, 7, 5, 2, 0]{quote} `countNodes(block)` returns {quote}LIVE=8, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=16{quote} so we need to replicate block 8, but there are no more racks available. What I still do not understand is why some blocks are replicated more than once instead of block 8 being replicated. was (Author: marvelrock): I find a clue: the `chooseSourceDatanodes` get {quote}LIVE=2, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=22{quote} and all block index (0-8) exists, and three blocks 3/4/8 have no redundant block, and the datanode where block 8 stored is in DECOMMISSIONING, other two datanode adminState is null. the `countNodes(block)` get {quote}LIVE=8, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=16{quote} so we need to replicate block 8, but there is no racks anymore. Now, I have a doubt why replicate some block more than once other than replicate the block 8 ? > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, liveBlockIndices.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
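To sanity-check the numbers quoted in the comment above, here is a small self-contained tally (a toy program, not HDFS code; the class and variable names are made up). Counting how many copies of each internal block index appear in the quoted index list shows that indices 3, 4 and 8 each appear exactly once, and the remaining 22 entries are redundant copies, which lines up with the REDUNDANT=22 figure above.

{code:java}
// Tally of the block index list quoted in the comment above (toy code, not
// NameNode code): how many copies exist per internal block index, how many
// entries are redundant, and which indices have no redundancy at all.
import java.util.Map;
import java.util.TreeMap;

public class RedundancyTally {
  public static void main(String[] args) {
    // the block index list quoted in the comment above
    int[] liveIndices = {0, 1, 2, 3, 4, 5, 6, 7, 8, 6, 7, 6, 6, 5, 0, 1, 5, 0,
        2, 5, 2, 5, 1, 2, 1, 5, 2, 7, 5, 2, 0};

    // count copies per internal block index
    Map<Integer, Integer> copiesPerIndex = new TreeMap<>();
    for (int idx : liveIndices) {
      copiesPerIndex.merge(idx, 1, Integer::sum);
    }

    // every entry beyond one per distinct index is a redundant copy
    int redundant = liveIndices.length - copiesPerIndex.size();
    System.out.println("redundant copies = " + redundant); // prints 22

    // indices with exactly one copy are the ones that actually need another
    // replica; per the comment, index 8 sits on a DECOMMISSIONING datanode
    copiesPerIndex.forEach((idx, copies) -> {
      if (copies == 1) {
        System.out.println("no redundancy for internal block index " + idx);
      }
    });
  }
}
{code}

The program prints 22 redundant copies and flags indices 3, 4 and 8, which matches the observation that block 8 is the one that should be replicated while other indices already hold several extra copies.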
[jira] [Issue Comment Deleted] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Comment: was deleted (was: !liveBlockIndices.png! ) > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, liveBlockIndices.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933382#comment-16933382 ] HuangTao commented on HDFS-14849: - !liveBlockIndices.png! > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, liveBlockIndices.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Attachment: liveBlockIndices.png > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, liveBlockIndices.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933376#comment-16933376 ] HuangTao edited comment on HDFS-14849 at 9/19/19 1:30 PM: -- I find a clue: the `chooseSourceDatanodes` get {quote}LIVE=2, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=22{quote} and all block index (0-8) exists, and three blocks 3/4/8 have no redundant block, and the datanode where block 8 stored is in DECOMMISSIONING, other two datanode adminState is null. the `countNodes(block)` get {quote}LIVE=8, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=16{quote} so we need to replicate block 8, but there is no racks anymore. Now, I have a doubt why replicate some block more than once other than replicate the block 8 ? was (Author: marvelrock): I find a clue: the `chooseSourceDatanodes` get {quote}LIVE=2, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=22{quote} and all block index (0-8) exists, and three blocks 3/4/8 have no redundant block, and the datanode where block 8 stored is in DECOMMISSIONING, other two datanode adminState is null. the `countNodes(block)` get {quote}LIVE=8, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=16{quote} so we need to replicate block 8, but there is no racks anymore. > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933376#comment-16933376 ] HuangTao commented on HDFS-14849: - I find a clue: the `chooseSourceDatanodes` get {quote}LIVE=2, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=22{quote} and all block index (0-8) exists, and three blocks 3/4/8 have no redundant block, and the datanode where block 8 stored is in DECOMMISSIONING, other two datanode adminState is null. the `countNodes(block)` get {quote}LIVE=8, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=16{quote} so we need to replicate block 8, but there is no racks anymore. > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Description: When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in that datanode will be replicated infinitely. // added 2019/09/19 I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes simultaneously. !scheduleReconstruction.png! !fsck-file.png! was: When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in that datanode will be replicated infinitely. // added 2019/09/19 I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes simultaneously. !scheduleReconstruction.png! !fsck-file.png! > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933266#comment-16933266 ] HuangTao commented on HDFS-14849: - [~ferhui] We have the same scenario, but our fix can't pass each other's UT. > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Attachment: HDFS-14849.002.patch > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, > fsck-file.png, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Description: When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in that datanode will be replicated infinitely. // added 2019/09/19 I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes simultaneously. !scheduleReconstruction.png! !fsck-file.png! was: When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in that datanode will be replicated infinitely. // added 2019/09/19 I reproduced this scenario in a 165 nodes cluster with decommission 100 nodes simultaneously. !scheduleReconstruction.png! !fsck-file.png! > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, fsck-file.png, > scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 163 nodes cluster with decommission 100 nodes > simultaneously. > > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Description: When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in that datanode will be replicated infinitely. // added 2019/09/19 I reproduced this scenario in a 165 nodes cluster with decommission 100 nodes simultaneously. !scheduleReconstruction.png! !fsck-file.png! was: When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in that datanode will be replicated infinitely. > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, fsck-file.png, > scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. > // added 2019/09/19 > I reproduced this scenario in a 165 nodes cluster with decommission 100 nodes > simultaneously. > > !scheduleReconstruction.png! > !fsck-file.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Attachment: fsck-file.png > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, fsck-file.png, > scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HuangTao updated HDFS-14849: Attachment: scheduleReconstruction.png > Erasure Coding: replicate block infinitely when datanode being decommissioning > -- > > Key: HDFS-14849 > URL: https://issues.apache.org/jira/browse/HDFS-14849 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Labels: EC, HDFS, NameNode > Attachments: HDFS-14849.001.patch, scheduleReconstruction.png > > > When the datanode keeping in DECOMMISSION_INPROGRESS status, the EC block in > that datanode will be replicated infinitely. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org