[jira] [Comment Edited] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140300#comment-17140300 ]

bhji123 edited comment on HDFS-15419 at 6/19/20, 7:33 AM:
--
Yes, the router is just a proxy, but it is also a server. Clients can decide whether to wait/retry or not, but not all clients are that smart, especially when there is a variety of different clients. For those less smart clients, this PR is very useful. For the very smart clients that do not want the router to retry, that is fine too, because the router retry is now configurable.

was (Author: bhji123):
Yes, the router is just a proxy, and it is also a server. Clients can decide whether to wait/retry or not, but not all clients are that smart, especially when there is a variety of different clients. For those less smart clients, this PR is very useful. For the very smart clients that do not want the router to retry, that is fine too, because the router retry is now configurable.

> RBF: Router should retry communicate with NN when cluster is unavailable
> using configurable time interval
> -
>
> Key: HDFS-15419
> URL: https://issues.apache.org/jira/browse/HDFS-15419
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: configuration, hdfs-client, rbf
> Reporter: bhji123
> Priority: Major
>
> When the cluster is unavailable, router -> namenode communication will only
> retry once, without any time interval, which is not reasonable.
> For example, in my company, which has several HDFS clusters with more than
> 1000 nodes, we have encountered this problem. In some cases, the cluster
> becomes unavailable briefly, for about 10 or 30 seconds, and at the same time
> almost all RPC requests to the router fail because the router only retries
> once, without a time interval.
> It would be better to enhance the router retry strategy, to retry
> communication with the NN using a configurable time interval and maximum
> retry count.
> -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
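The configurable retry behaviour proposed above could be sketched roughly as follows. This is only an illustration: the class and the config key names in the comments are hypothetical, not the actual HDFS-15419 patch.

```java
import java.util.function.Supplier;

// Minimal sketch of the proposal: retry a router -> NN call up to a maximum
// count, sleeping a fixed interval between attempts instead of failing after
// a single try. Class and property names below are hypothetical illustrations.
class RetryingInvoker {
    private final int maxRetries;       // would come from a config key such as
                                        // dfs.federation.router.retry.max (hypothetical)
    private final long retryIntervalMs; // e.g. dfs.federation.router.retry.interval.ms (hypothetical)

    RetryingInvoker(int maxRetries, long retryIntervalMs) {
        this.maxRetries = maxRetries;
        this.retryIntervalMs = retryIntervalMs;
    }

    <T> T invoke(Supplier<T> call) {
        RuntimeException last = null;
        // One initial attempt plus up to maxRetries retries.
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxRetries) {
                    try {
                        Thread.sleep(retryIntervalMs);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
        throw last;
    }
}
```

With a brief outage (say two failed attempts, then recovery), such a loop would ride out the unavailability instead of surfacing the failure to every client.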
[jira] [Updated] (HDFS-15415) Reduce locking in Datanode DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen O'Donnell updated HDFS-15415:
-
Attachment: HDFS-15415.001.patch

> Reduce locking in Datanode DirectoryScanner
> ---
>
> Key: HDFS-15415
> URL: https://issues.apache.org/jira/browse/HDFS-15415
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.4.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Attachments: HDFS-15415.001.patch
>
> In HDFS-15406, we made a small change to greatly reduce the runtime and
> locking time of the datanode DirectoryScanner. There may be room for further
> improvement here:
> 1. These lines of code in DirectoryScanner#scan() obtain a snapshot of the
> finalized blocks from memory, and then sort them, under the DN lock. However,
> the blocks are stored in a sorted structure (FoldedTreeSet), and hence the
> sort should be unnecessary.
> {code}
> final List<ReplicaInfo> bl = dataset.getFinalizedBlocks(bpid);
> Collections.sort(bl); // Sort based on blockId
> {code}
> 2. From the scan step, we have captured a snapshot of what is on disk. After
> calling `dataset.getFinalizedBlocks(bpid);` as above, we have taken a
> snapshot of what is in memory. The two snapshots are never 100% in sync, as
> things are always changing while the disk is scanned.
> We are only comparing finalized blocks, so they should not really change:
> * If a block is deleted after our snapshot, our snapshot will not see it,
> and that is OK.
> * A finalized block could be appended to. If that happens, both the genstamp
> and length will change, but that should be handled by reconcile when it
> calls `FsDatasetImpl.checkAndUpdate()`, and there is nothing stopping blocks
> being appended after they have been scanned from disk but before they have
> been compared with memory.
> My suspicion is that we can do all the comparison work outside of the lock,
> since checkAndUpdate() re-checks any differences later, under the lock, on a
> block-by-block basis.
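The comparison referred to above is essentially a merge-style walk over two snapshots sorted by block ID. A stripped-down sketch of that idea (block IDs modeled as plain longs, not the real ScanInfo/ReplicaInfo types):

```java
import java.util.ArrayList;
import java.util.List;

// Stripped-down sketch of the disk-vs-memory comparison: a merge-style walk
// over two snapshots sorted by block ID. Because both inputs are immutable
// snapshots, the walk itself needs no dataset lock; any differences it
// records can be re-checked per block later under the lock, as the comment
// above describes. Block IDs are plain longs here for illustration only.
class SnapshotDiff {
    static List<String> diff(List<Long> onDisk, List<Long> inMemory) {
        List<String> differences = new ArrayList<>();
        int d = 0, m = 0;
        while (d < onDisk.size() && m < inMemory.size()) {
            long diskId = onDisk.get(d), memId = inMemory.get(m);
            if (diskId < memId) {
                differences.add("missing in memory: " + diskId);
                d++;
            } else if (diskId > memId) {
                differences.add("missing on disk: " + memId);
                m++;
            } else {
                d++;  // present in both; genstamp/length checks would go here
                m++;
            }
        }
        for (; d < onDisk.size(); d++) differences.add("missing in memory: " + onDisk.get(d));
        for (; m < inMemory.size(); m++) differences.add("missing on disk: " + inMemory.get(m));
        return differences;
    }
}
```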
[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140397#comment-17140397 ]

Stephen O'Donnell commented on HDFS-15415:
--
Uploaded an initial patch to remove the unnecessary sort. In doing this I renamed FsDatasetSpi#getFinalizedBlocks to getSortedFinalizedBlocks and added a unit test to ensure it returns a sorted list. This is to ensure that any future change to the datanode's internal block map does not break the sorting somehow.

I still need to look at the logic performed under the lock to see if we can reduce the scope of the lock safely.
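The sortedness guarantee that the unit test mentioned above is meant to pin down can be illustrated in isolation. Block IDs are modeled as plain longs here; the real test works against FsDatasetSpi and ReplicaInfo, so this is only a sketch of the invariant being tested:

```java
import java.util.List;

// Sketch of the invariant behind the getSortedFinalizedBlocks rename: the
// snapshot of finalized blocks must come back ordered by block ID, so that
// DirectoryScanner can drop its Collections.sort(). Block IDs are plain
// longs here for illustration; the real check runs against FsDatasetSpi.
class SortedSnapshotCheck {
    static boolean isSortedByBlockId(List<Long> blockIds) {
        for (int i = 1; i < blockIds.size(); i++) {
            if (blockIds.get(i - 1) > blockIds.get(i)) {
                return false;
            }
        }
        return true;
    }
}
```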
[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140477#comment-17140477 ]

Stephen O'Donnell commented on HDFS-15415:
--
I have annotated the main loop the DirectoryScanner runs under the datanode lock below, with comments starting "SOD:".

It is important to keep in mind that this compare phase creates a list of differences. These differences are then checked again, block by block, under the datanode lock in the reconcile step. Some "incorrect" differences are likely to be recorded even under the lock, as scanning the disks takes time. The scan is performed outside of the lock, so the DN could be appending, deleting and adding blocks during that time. Therefore, if some more changes happen while comparing the disk results to the in-memory blocks, it is not a big deal: they will get re-checked and resolved during the reconcile step.

I have a small concern that if the disk balancer is running and moving blocks around, it could cause more differences. However, I don't see any protection against that when scanning the volumes either, so a block could potentially be counted on vol1, moved to vol2 and then counted again.

Overall, I feel it is safe to limit the lock to be around the call to `dataset.getSortedFinalizedBlocks(bpid)` only. Please let me know if anyone thinks that is wrong, or I am missing something obvious.
{code}
// Hold FSDataset lock to prevent further changes to the block map
try (AutoCloseableLock lock = dataset.acquireDatasetLock()) {
  for (final String bpid : blockPoolReport.getBlockPoolIds()) {
    List<ScanInfo> blockpoolReport = blockPoolReport.getScanInfo(bpid);
    Stats statsRecord = new Stats(bpid);
    stats.put(bpid, statsRecord);
    Collection<ScanInfo> diffRecord = new ArrayList<>();
    statsRecord.totalBlocks = blockpoolReport.size();
    // Need to hold a lock here to prevent the replica map changing
    final List<ReplicaInfo> bl = dataset.getSortedFinalizedBlocks(bpid);
    // SOD: After here, we have a "snapshot" of the replicas that were in the
    // replica map. It doesn't really matter if those replicas change or
    // not as we go through the checks, as we are working off the snapshot.
    // The in-memory version will have diverged from the on-disk details as
    // the disk is scanned anyway.
    int d = 0; // index for blockpoolReport
    int m = 0; // index for memReport
    while (m < bl.size() && d < blockpoolReport.size()) {
      ReplicaInfo memBlock = bl.get(m);
      ScanInfo info = blockpoolReport.get(d);
      // SOD: This block is safe to run outside of the lock
      if (info.getBlockId() < memBlock.getBlockId()) {
        // SOD: isDeletingBlock() is a synchronized method, so we don't need a
        // lock to check it.
        if (!dataset.isDeletingBlock(bpid, info.getBlockId())) {
          // Block is missing in memory
          statsRecord.missingMemoryBlocks++;
          addDifference(diffRecord, statsRecord, info);
        }
        d++;
        continue;
      }
      // SOD: This is safe outside the lock
      if (info.getBlockId() > memBlock.getBlockId()) {
        // Block is missing on the disk
        addDifference(diffRecord, statsRecord,
            memBlock.getBlockId(), info.getVolume());
        m++;
        continue;
      }
      // Block file and/or metadata file exists on the disk
      // Block exists in memory
      // SOD: This branch looks safe
      if (info.getVolume().getStorageType() != StorageType.PROVIDED
          && info.getBlockFile() == null) {
        // Block metadata file exists and block file is missing
        addDifference(diffRecord, statsRecord, info);
      // SOD: If we don't have a lock, an append or truncate could alter the
      // block length or gen stamp. However, these could already have changed
      // as the disk was scanned. Therefore I believe it is safe to do this
      // outside the lock. Worst case we gather some extra differences, but
      // they get handled in the reconcile step.
      } else if (info.getGenStamp() != memBlock.getGenerationStamp()
          || info.getBlockLength() != memBlock.getNumBytes()) {
        // Block metadata file is missing or has wrong generation stamp,
        // or block file length is different than expected
        statsRecord.mismatchBlocks++;
        addDifference(diffRecord, statsRecord, info);
      // SOD: The compareWith method checks the expected locations of the
      // block (ie vol/subdir/subdir/blk_) with what was found on the disk
      // scan. This section is a concern, as the disk balancer could move
      // a block and then this change would log a d
[jira] [Created] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
Kihwal Lee created HDFS-15421:
-
Summary: IBR leak causes standby NN to be stuck in safe mode
Key: HDFS-15421
URL: https://issues.apache.org/jira/browse/HDFS-15421
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Reporter: Kihwal Lee

After HDFS-14941, the update of the global gen stamp is delayed in certain situations. This makes the last set of incremental block reports from an append appear to be "from the future", which causes them to simply be re-queued to the pending DN message queue rather than processed to complete the block. The last set of IBRs will leak and never be cleaned up until the NN transitions to active. The size of {{pendingDNMessages}} constantly grows until then.

If a leak happens while in startup safe mode, the namenode will never be able to come out of safe mode on its own.
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140597#comment-17140597 ]

Kihwal Lee commented on HDFS-15421:
---
This is an example of "stuck safe mode" from one of our small test clusters:

{noformat}
The reported blocks 3045352 needs additional 14058 blocks to reach the threshold 1. of total blocks 3059410. The minimum number of live datanodes is not required. Safe mode will be turned off automatically once the thresholds have been reached.
2020-06-11 18:35:19,863 [Block report processor] INFO hdfs.StateChange: STATE* Safe mode extension entered. The reported blocks 3059410 has reached the threshold 1. of total blocks 3059410. The minimum number of live datanodes is not required. In safe mode extension. Safe mode will be turned off automatically in 30 seconds.
2020-06-11 18:35:25,036 [Edit log tailer] INFO namenode.FSImage: Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@259766e0 expecting start txid #3427451497
2020-06-11 18:35:25,036 [Edit log tailer] INFO namenode.FSImage: Start loading edits file xxx
2020-06-11 18:35:25,036 [Edit log tailer] INFO namenode.RedundantEditLogInputStream: Fast-forwarding stream 'xxx' to transaction ID 3427451497
2020-06-11 18:35:25,060 [Edit log tailer] INFO namenode.FSImage: Loaded 1 edits file(s) (the last named xxx of total size 19024.0, total edits 124.0, total load time 25.0 ms
2020-06-11 18:35:39,868 [org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode$SafeModeMonitor@6d4a65c6] INFO hdfs.StateChange: STATE* Safe mode ON, in safe mode extension. The reported blocks 3059416 needs additional 1 blocks to reach the threshold 1. of total blocks 3059417. The minimum number of live datanodes is not required. In safe mode extension. Safe mode will be turned off automatically in 9 seconds.
2020-06-11 18:35:59,873 [org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode$SafeModeMonitor@6d4a65c6] INFO hdfs.StateChange: STATE* Safe mode ON, thresholds not met. The reported blocks 3059416 needs additional 1 blocks to reach the threshold 1. of total blocks 3059417. The minimum number of live datanodes is not required. In safe mode extension. Safe mode will be turned off automatically in -10 seconds.
2020-06-11 18:36:19,880 [org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode$SafeModeMonitor@6d4a65c6] INFO hdfs.StateChange: STATE* Safe mode ON, thresholds not met. The reported blocks 3059416 needs additional 1 blocks to reach the threshold 1. of total blocks 3059417. The minimum number of live datanodes is not required. In safe mode extension. Safe mode will be turned off automatically in -30 seconds.
2020-06-11 18:36:39,888 [org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode$SafeModeMonitor@6d4a65c6] INFO hdfs.StateChange: STATE* Safe mode ON, thresholds not met. The reported blocks 3059416 needs additional 1 blocks to reach the threshold 1. of total blocks 3059417.
{noformat}

The time remaining in the extension grows indefinitely negative, and the number of additionally required blocks increases as more IBRs leak. You can force the NN out of safe mode, but the leak continues until an HA transition.
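The "from the future" queueing decision at the core of this report reduces to a genstamp comparison against the global stamp the standby has replayed so far. A hedged toy model of that mechanism (field and method names are illustrative, not the actual BlockManager internals):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of the queueing decision described above: a standby
// postpones any reported block whose generation stamp is ahead of the global
// stamp it has replayed so far. If the stamp update never arrives (the
// HDFS-15421 scenario), the report is re-queued forever and leaks. All names
// here are illustrative, not the actual BlockManager internals.
class PendingIbrQueue {
    private long globalGenStamp;
    private final Deque<long[]> pending = new ArrayDeque<>(); // {blockId, genStamp}

    PendingIbrQueue(long initialGenStamp) {
        this.globalGenStamp = initialGenStamp;
    }

    /** Returns true if the report was processed, false if postponed. */
    boolean report(long blockId, long genStamp) {
        if (genStamp > globalGenStamp) {          // "generation stamp is in the future"
            pending.add(new long[] {blockId, genStamp});
            return false;
        }
        return true;
    }

    /** Replay an edit that advances the global stamp, then retry the queue. */
    int advanceAndRetry(long newGenStamp) {
        globalGenStamp = Math.max(globalGenStamp, newGenStamp);
        int processed = 0;
        for (int i = pending.size(); i > 0; i--) {
            long[] ibr = pending.poll();
            if (report(ibr[0], ibr[1])) {
                processed++;                      // completed
            }                                     // else: re-queued, possibly forever
        }
        return processed;
    }

    int pendingSize() {
        return pending.size();
    }
}
```

If the global stamp stops short of the last append's stamp, the final IBR cycles through `advanceAndRetry` without ever being processed, mirroring the growing {{pendingDNMessages}} described above.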
[jira] [Commented] (HDFS-14941) Potential editlog race condition can cause corrupted file
[ https://issues.apache.org/jira/browse/HDFS-14941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140608#comment-17140608 ]

Kihwal Lee commented on HDFS-14941:
---
Filed HDFS-15421 with more details.

> Potential editlog race condition can cause corrupted file
> -
>
> Key: HDFS-14941
> URL: https://issues.apache.org/jira/browse/HDFS-14941
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Labels: ha
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
> Attachments: HDFS-14941.001.patch, HDFS-14941.002.patch,
> HDFS-14941.003.patch, HDFS-14941.004.patch, HDFS-14941.005.patch,
> HDFS-14941.006.patch
>
> Recently we encountered an issue where, after a failover, the NameNode
> complains about corrupted files/missing blocks. The blocks did recover after
> full block reports, so the blocks are not actually missing. After further
> investigation, we believe this is what happened:
> First of all, on the SbN, it is possible that it receives block reports
> before the corresponding edit tailing has happened. In that case the SbN
> postpones processing the DN block report, handled by the guarding logic
> below:
> {code:java}
> if (shouldPostponeBlocksFromFuture &&
>     namesystem.isGenStampInFuture(iblk)) {
>   queueReportedBlock(storageInfo, iblk, reportedState,
>       QUEUE_REASON_FUTURE_GENSTAMP);
>   continue;
> }
> {code}
> Basically, if the reported block has a future generation stamp, the DN
> report gets requeued.
> However, in {{FSNamesystem#storeAllocatedBlock}}, we have the following
> code:
> {code:java}
> // allocate new block, record block locations in INode.
> newBlock = createNewBlock();
> INodesInPath inodesInPath = INodesInPath.fromINode(pendingFile);
> saveAllocatedBlock(src, inodesInPath, newBlock, targets);
> persistNewBlock(src, pendingFile);
> offset = pendingFile.computeFileSize();
> {code}
> The line {{newBlock = createNewBlock();}} logs an edit entry
> {{OP_SET_GENSTAMP_V2}} to bump the generation stamp on the Standby, while
> the following line {{persistNewBlock(src, pendingFile);}} logs another edit
> entry {{OP_ADD_BLOCK}} to actually add the block on the Standby.
> The race condition is this: imagine the Standby has just processed
> {{OP_SET_GENSTAMP_V2}} but not yet {{OP_ADD_BLOCK}} (if they just happen to
> be in different segments). Now a block report with the new generation stamp
> comes in. Since the genstamp bump has already been processed, the reported
> block may not be considered a future block, so the guarding logic passes.
> But actually the block hasn't been added to the blockmap, because the second
> edit is yet to be tailed. So the block then gets added to the invalidate
> block list and we see messages like:
> {code:java}
> BLOCK* addBlock: block XXX on node XXX size XXX does not belong to any file
> {code}
> Even worse, since this IBR is effectively lost, the NameNode has no
> information about this block until the next full block report. So after a
> failover, the NN marks it as corrupt.
> This issue won't happen, though, if both of the edit entries get tailed
> together, so that no IBR processing can happen in between. But in our case
> we set the edit tailing interval very low (to allow Standby reads), so under
> high workload there is a much higher chance that the two entries are tailed
> separately, causing the issue.
[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140607#comment-17140607 ]

Kihwal Lee commented on HDFS-15421:
---
Example of the leak itself (single replica shown for simplicity):

1) IBRs queued. The file was created, data written to it, and closed. Then it was opened for append, additional data written, and closed.
{noformat}
2020-06-19 02:38:27,423 [Block report processor] INFO blockmanagement.BlockManager: Queueing reported block blk_1521788462_1099975774416 in state FINALIZED from datanode 1.2.3.4:1004 for later processing because generation stamp is in the future.
2020-06-19 02:38:28,190 [Block report processor] INFO blockmanagement.BlockManager: Queueing reported block blk_1521788462_1099975774420 in state FINALIZED from datanode 1.2.3.4:1004 for later processing because generation stamp is in the future.
{noformat}

2) Processing of queued IBRs as edits are replayed. The IBR with the first gen stamp, for the initial file, is processed. The one from the append is not, as its gen stamp is still in the future; it is re-queued.
{noformat}
2020-06-19 02:42:22,774 [Edit log tailer] INFO blockmanagement.BlockManager: Processing previouly queued message ReportedBlockInfo [block=blk_1521788462_1099975774416, dn=1.2.3.4:1004, reportedState=FINALIZED]
2020-06-19 02:42:22,774 [Edit log tailer] INFO blockmanagement.BlockManager: Processing previouly queued message ReportedBlockInfo [block=blk_1521788462_1099975774420, dn=1.2.3.4:1004, reportedState=FINALIZED]
2020-06-19 02:42:22,774 [Edit log tailer] INFO blockmanagement.BlockManager: Queueing reported block blk_1521788462_1099975774420 in state FINALIZED from datanode 1.2.3.4:1004 for later processing because generation stamp is in the future.
{noformat}

3) When the edit for the append is replayed, the IBR is still identified as from the future and is re-queued. Since there are no more edits regarding this file, the IBR is leaked.
{noformat}
2020-06-19 02:42:22,776 [Edit log tailer] INFO blockmanagement.BlockManager: Processing previouly queued message ReportedBlockInfo [block=blk_1521788462_1099975774420, dn=1.2.3.4:1004, reportedState=FINALIZED]
2020-06-19 02:42:22,776 [Edit log tailer] INFO blockmanagement.BlockManager: Queueing reported block blk_1521788462_1099975774420 in state FINALIZED from datanode 1.2.3.4:1004 for later processing because generation stamp is in the future.
{noformat}

With HDFS-14941 reverted, the last IBR is processed as expected and the leak no longer happens.
[jira] [Updated] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-15421:
--
Priority: Blocker (was: Critical)
[jira] [Comment Edited] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140607#comment-17140607 ]

Kihwal Lee edited comment on HDFS-15421 at 6/19/20, 2:56 PM:
-
Example of the leak itself (single replica shown for simplicity):

1) IBRs queued. The file was created, data written to it, and closed. Then it was opened for append, additional data written, and closed.
{noformat}
2020-06-19 02:38:27,423 [Block report processor] INFO blockmanagement.BlockManager: Queueing reported block blk_1521788462_1099975774416 in state FINALIZED from datanode 1.2.3.4:1004 for later processing because generation stamp is in the future.
2020-06-19 02:38:28,190 [Block report processor] INFO blockmanagement.BlockManager: Queueing reported block blk_1521788462_1099975774420 in state FINALIZED from datanode 1.2.3.4:1004 for later processing because generation stamp is in the future.
{noformat}

2) Processing of queued IBRs as edits are replayed. The IBR with the first gen stamp, for the initial file, is processed. The one from the append is not, as its gen stamp is still in the future; it is re-queued.
{noformat}
2020-06-19 02:42:22,774 [Edit log tailer] INFO blockmanagement.BlockManager: Processing previouly queued message ReportedBlockInfo [block=blk_1521788462_1099975774416, dn=1.2.3.4:1004, reportedState=FINALIZED]
2020-06-19 02:42:22,774 [Edit log tailer] INFO blockmanagement.BlockManager: Processing previouly queued message ReportedBlockInfo [block=blk_1521788462_1099975774420, dn=1.2.3.4:1004, reportedState=FINALIZED]
2020-06-19 02:42:22,774 [Edit log tailer] INFO blockmanagement.BlockManager: Queueing reported block blk_1521788462_1099975774420 in state FINALIZED from datanode 1.2.3.4:1004 for later processing because generation stamp is in the future.
{noformat}

3) When the edit for the append is replayed, the IBR is still identified as from the future and is re-queued. Since there are no more edits regarding this file, the IBR is leaked.
{noformat}
2020-06-19 02:42:22,776 [Edit log tailer] INFO blockmanagement.BlockManager: Processing previouly queued message ReportedBlockInfo [block=blk_1521788462_1099975774420, dn=1.2.3.4:1004, reportedState=FINALIZED]
2020-06-19 02:42:22,776 [Edit log tailer] INFO blockmanagement.BlockManager: Queueing reported block blk_1521788462_1099975774420 in state FINALIZED from datanode 1.2.3.4:1004 for later processing because generation stamp is in the future.
{noformat}

With HDFS-14941 reverted, the last IBR is processed as expected and the leak no longer happens.

Note: The original logging level of the lines above is DEBUG, but it was changed to INFO temporarily.
[jira] [Created] (HDFS-15422) Reported IBR is partially replaced with stored info when queuing.
Kihwal Lee created HDFS-15422:
-
Summary: Reported IBR is partially replaced with stored info when queuing.
Key: HDFS-15422
URL: https://issues.apache.org/jira/browse/HDFS-15422
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Reporter: Kihwal Lee

When queueing an IBR (incremental block report) on a standby namenode, some of the reported information is replaced with the existing stored information. This can lead to false block corruption.

We had a namenode that, after transitioning to active, started reporting missing blocks with "SIZE_MISMATCH" as the corrupt reason. These were blocks that had been appended to, and the sizes were actually correct on the datanodes. Upon further investigation, it was determined that the namenode was queueing IBRs with altered information.
[jira] [Commented] (HDFS-15422) Reported IBR is partially replaced with stored info when queuing.
[ https://issues.apache.org/jira/browse/HDFS-15422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140618#comment-17140618 ] Kihwal Lee commented on HDFS-15422: --- The fix is simple. {code} @@ -2578,10 +2578,7 @@ private BlockInfo processReportedBlock( // If the block is an out-of-date generation stamp or state, // but we're the standby, we shouldn't treat it as corrupt, // but instead just queue it for later processing. -// TODO: Pretty confident this should be s/storedBlock/block below, -// since we should be postponing the info of the reported block, not -// the stored block. See HDFS-6289 for more context. -queueReportedBlock(storageInfo, storedBlock, reportedState, +queueReportedBlock(storageInfo, block, reportedState, QUEUE_REASON_CORRUPT_STATE); } else { toCorrupt.add(c); {code} If the old information in memory({{storedBlock}}) is used in queueing a report, the size may be old. Unlike GENSTAMP_MISMATCH, this kind of corruption can be undone when the NN sees a correct report again. I.e. forcing a block report won't fix this condition. > Reported IBR is partially replaced with stored info when queuing. > - > > Key: HDFS-15422 > URL: https://issues.apache.org/jira/browse/HDFS-15422 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Priority: Critical > > When queueing an IBR (incremental block report) on a standby namenode, some > of the reported information is being replaced with the existing stored > information. This can lead to false block corruption. > We had a namenode, after transitioning to active, started reporting missing > blocks with "SIZE_MISMATCH" as corrupt reason. These were blocks that were > appended and the sizes were actually correct on the datanodes. Upon further > investigation, it was determined that the namenode was queueing IBRs with > altered information. 
[jira] [Updated] (HDFS-15415) Reduce locking in Datanode DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen O'Donnell updated HDFS-15415: - Status: Patch Available (was: Open) > Reduce locking in Datanode DirectoryScanner > --- > > Key: HDFS-15415 > URL: https://issues.apache.org/jira/browse/HDFS-15415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.4.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: HDFS-15415.001.patch > > > In HDFS-15406, we have a small change to greatly reduce the runtime and > locking time of the datanode DirectoryScanner. There may be room for further > improvement here: > 1. These lines of code in DirectoryScanner#scan() obtain a snapshot of the > finalized blocks from memory, and then sort them, under the DN lock. However > the blocks are stored in a sorted structure (FoldedTreeSet) and hence the > sort should be unnecessary. > {code} > final List bl = dataset.getFinalizedBlocks(bpid); > Collections.sort(bl); // Sort based on blockId > {code} > 2. From the scan step, we have captured a snapshot of what is on disk. After > calling `dataset.getFinalizedBlocks(bpid);` as above we have taken a snapshot > of what is in memory. The two snapshots are never 100% in sync as things are always > changing as the disk is scanned. > We are only comparing finalized blocks, so they should not really change: > * If a block is deleted after our snapshot, our snapshot will not see it and > that is OK. > * A finalized block could be appended. If that happens both the genstamp and > length will change, but that should be handled by reconcile when it calls > `FSDatasetImpl.checkAndUpdate()`, and there is nothing stopping blocks being > appended after they have been scanned from disk, but before they have been > compared with memory. 
> My suspicion is that we can do all the comparison work outside of the lock > and checkAndUpdate() re-checks any differences later under the lock on a > block by block basis.
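The snapshot-then-compare pattern described in this thread can be sketched roughly as follows. This is an illustrative sketch only, not the actual DirectoryScanner code; the class and method names here are assumptions:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposal: copy the in-memory finalized-block list while
// holding the dataset lock, then do the expensive disk-vs-memory comparison
// outside the lock. Any false positives caused by concurrent changes are
// re-checked later, block by block, under the lock (the reconcile step).
class ScannerSketch {
    private final Object datasetLock = new Object();
    private final List<Long> finalizedBlockIds = new ArrayList<>();

    void addFinalized(long blockId) {
        synchronized (datasetLock) {
            finalizedBlockIds.add(blockId);
        }
    }

    // Snapshot taken under the lock. If the backing structure already keeps
    // blocks sorted (as FoldedTreeSet does), no extra sort is needed here.
    List<Long> snapshotFinalized() {
        synchronized (datasetLock) {
            return new ArrayList<>(finalizedBlockIds);
        }
    }

    // The comparison runs without holding the lock; it may report extra
    // differences for blocks that changed concurrently, which is acceptable
    // because reconcile re-verifies each difference under the lock.
    static List<Long> diffAgainstDisk(List<Long> memorySnapshot, List<Long> onDisk) {
        List<Long> differences = new ArrayList<>(memorySnapshot);
        differences.removeAll(onDisk);
        return differences;
    }
}
```

The trade-off, as noted above, is possibly more differences to reconcile later in exchange for a much shorter lock hold time.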
[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140699#comment-17140699 ] hemanthboyina commented on HDFS-15415: -- thanks [~sodonnell] for your analysis. After taking the snapshot, if we do not acquire the lock, have you considered the scenario where blocks are being converted from RBW to FINALIZED? {quote}A finalized block could be appended. If that happens both the genstamp and length will change {quote} Agree with you. Though the replica will be changed from FINALIZED to RBW, since we are only getting the finalized blocks it shouldn't be a problem. > Reduce locking in Datanode DirectoryScanner > --- > > Key: HDFS-15415 > URL: https://issues.apache.org/jira/browse/HDFS-15415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.4.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: HDFS-15415.001.patch > > > In HDFS-15406, we have a small change to greatly reduce the runtime and > locking time of the datanode DirectoryScanner. There may be room for further > improvement here: > 1. These lines of code in DirectoryScanner#scan() obtain a snapshot of the > finalized blocks from memory, and then sort them, under the DN lock. However > the blocks are stored in a sorted structure (FoldedTreeSet) and hence the > sort should be unnecessary. > {code} > final List bl = dataset.getFinalizedBlocks(bpid); > Collections.sort(bl); // Sort based on blockId > {code} > 2. From the scan step, we have captured a snapshot of what is on disk. After > calling `dataset.getFinalizedBlocks(bpid);` as above we have taken a snapshot > of what is in memory. The two snapshots are never 100% in sync as things are always > changing as the disk is scanned. > We are only comparing finalized blocks, so they should not really change: > * If a block is deleted after our snapshot, our snapshot will not see it and > that is OK. 
> * A finalized block could be appended. If that happens both the genstamp and > length will change, but that should be handled by reconcile when it calls > `FSDatasetImpl.checkAndUpdate()`, and there is nothing stopping blocks being > appended after they have been scanned from disk, but before they have been > compared with memory. > My suspicion is that we can do all the comparison work outside of the lock > and checkAndUpdate() re-checks any differences later under the lock on a > block by block basis.
[jira] [Commented] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool
[ https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140718#comment-17140718 ] Íñigo Goiri commented on HDFS-15410: We can fix the checkstyle. What about adding hdfs as a prefix for the new config? It is true that we haven't added the prefix in any of the classes, but HDFS is definitely needed here. BTW, are the current tests covering this change indirectly? > Add separated config file fedbalance-default.xml for fedbalance tool > > > Key: HDFS-15410 > URL: https://issues.apache.org/jira/browse/HDFS-15410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-15410.001.patch, HDFS-15410.002.patch > > > Add a separated config file named fedbalance-default.xml for fedbalance tool > configs. It's like the distcp-default.xml for distcp tool.
[jira] [Commented] (HDFS-15416) DataStorage#addStorageLocations() should add more reasonable information verification.
[ https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140720#comment-17140720 ] Íñigo Goiri commented on HDFS-15416: Let's go with the patch here instead of the PR. Can we add a test? > DataStorage#addStorageLocations() should add more reasonable information > verification. > -- > > Key: HDFS-15416 > URL: https://issues.apache.org/jira/browse/HDFS-15416 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.1.0, 3.1.1 >Reporter: jianghua zhu >Assignee: jianghua zhu >Priority: Major > Attachments: HDFS-15416.patch > > > successLocations is a list; when it is empty, there is no need to execute > loadBlockPoolSliceStorage() again. > code: > try { > final List successLocations = loadDataStorage(datanode, nsInfo, dataDirs, startOpt, executor); > return loadBlockPoolSliceStorage(datanode, nsInfo, successLocations, startOpt, executor); > } finally { > executor.shutdown(); > }
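The verification being proposed above amounts to an early-return guard. A minimal sketch, assuming hypothetical method signatures (the real `loadDataStorage`/`loadBlockPoolSliceStorage` take more arguments and return HDFS-specific types):

```java
import java.util.Collections;
import java.util.List;

// Sketch of the proposed check: skip the second load step entirely when the
// first step produced no successful storage locations. Method bodies here are
// stand-ins for illustration, not the real DataStorage implementation.
class StorageSketch {
    static List<String> addStorageLocations(List<String> dataDirs) {
        List<String> successLocations = loadDataStorage(dataDirs);
        if (successLocations.isEmpty()) {
            // Nothing succeeded, so there are no block pool slices to load.
            return Collections.emptyList();
        }
        return loadBlockPoolSliceStorage(successLocations);
    }

    // Stand-in: pretend every requested directory loads successfully.
    static List<String> loadDataStorage(List<String> dirs) {
        return dirs;
    }

    // Stand-in: pretend every successful location yields a block pool slice.
    static List<String> loadBlockPoolSliceStorage(List<String> locations) {
        return locations;
    }
}
```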
[jira] [Updated] (HDFS-15417) RBF: Lazy get the datanode report for federation WebHDFS operations
[ https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ye Ni updated HDFS-15417: - Priority: Major (was: Minor) > RBF: Lazy get the datanode report for federation WebHDFS operations > --- > > Key: HDFS-15417 > URL: https://issues.apache.org/jira/browse/HDFS-15417 > Project: Hadoop HDFS > Issue Type: Improvement > Components: federation, rbf, webhdfs >Reporter: Ye Ni >Assignee: Ye Ni >Priority: Major > > *Why* > For WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, router or > namenode needs to get the datanodes where the block is located, then redirect > the request to one of the datanodes. > However, this chooseDatanode action in router is much slower than namenode, > which directly affects the WebHDFS operations above. > For namenode WebHDFS, it normally takes tens of milliseconds, while router > always takes more than 2 seconds. > *How* > Only get the datanode report when necessary in router. It is a very expensive > operation where most of the time is spent. > This is only needed when we want to exclude some datanodes or find a random > datanode for CREATE.
[jira] [Updated] (HDFS-15417) RBF: Get the datanode report from cache for federation WebHDFS operations
[ https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ye Ni updated HDFS-15417: - Summary: RBF: Get the datanode report from cache for federation WebHDFS operations (was: RBF: Lazy get the datanode report for federation WebHDFS operations) > RBF: Get the datanode report from cache for federation WebHDFS operations > - > > Key: HDFS-15417 > URL: https://issues.apache.org/jira/browse/HDFS-15417 > Project: Hadoop HDFS > Issue Type: Improvement > Components: federation, rbf, webhdfs >Reporter: Ye Ni >Assignee: Ye Ni >Priority: Major > > *Why* > For WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, router or > namenode needs to get the datanodes where the block is located, then redirect > the request to one of the datanodes. > However, this chooseDatanode action in router is much slower than namenode, > which directly affects the WebHDFS operations above. > For namenode WebHDFS, it normally takes tens of milliseconds, while router > always takes more than 2 seconds. > *How* > Only get the datanode report when necessary in router. It is a very expensive > operation where most of the time is spent. > This is only needed when we want to exclude some datanodes or find a random > datanode for CREATE.
[jira] [Updated] (HDFS-15417) RBF: Get the datanode report from cache for federation WebHDFS operations
[ https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ye Ni updated HDFS-15417: - Description: *Why* For WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, router or namenode needs to get the datanodes where the block is located, then redirect the request to one of the datanodes. However, this chooseDatanode action in router is much slower than namenode, which directly affects the WebHDFS operations above. For namenode WebHDFS, it normally takes tens of milliseconds, while router always takes more than 2 seconds. *How* Cache the datanode report in router RPC server. Actively refresh with a configured interval. Only get the datanode report when necessary in router. It is a very expensive operation where most of the time is spent. This is only needed when we want to exclude some datanodes or find a random datanode for CREATE. was: *Why* For WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, router or namenode needs to get the datanodes where the block is located, then redirect the request to one of the datanodes. However, this chooseDatanode action in router is much slower than namenode, which directly affects the WebHDFS operations above. For namenode WebHDFS, it normally takes tens of milliseconds, while router always takes more than 2 seconds. *How* Only get the datanode report when necessary in router. It is a very expensive operation where most of the time is spent. This is only needed when we want to exclude some datanodes or find a random datanode for CREATE. 
> RBF: Get the datanode report from cache for federation WebHDFS operations > - > > Key: HDFS-15417 > URL: https://issues.apache.org/jira/browse/HDFS-15417 > Project: Hadoop HDFS > Issue Type: Improvement > Components: federation, rbf, webhdfs >Reporter: Ye Ni >Assignee: Ye Ni >Priority: Major > > *Why* > For WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, router or > namenode needs to get the datanodes where the block is located, then redirect > the request to one of the datanodes. > However, this chooseDatanode action in router is much slower than namenode, > which directly affects the WebHDFS operations above. > For namenode WebHDFS, it normally takes tens of milliseconds, while router > always takes more than 2 seconds. > *How* > Cache the datanode report in router RPC server. Actively refresh with a > configured interval. Only get the datanode report when necessary in router. > It is a very expensive operation where most of the time is spent. > This is only needed when we want to exclude some datanodes or find a random > datanode for CREATE.
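The caching approach described in this update can be sketched as a small interval-based cache. This is a hedged illustration, not the actual router implementation; the class name, the `Supplier` loader, and the refresh policy are assumptions:

```java
import java.util.function.Supplier;

// Sketch: cache the result of an expensive call (e.g. fetching the datanode
// report) and refresh it only after a configured interval has elapsed, so
// most requests are served from the cached copy.
class CachedReport<T> {
    private final Supplier<T> loader;   // stand-in for the expensive call
    private final long refreshMillis;   // configured refresh interval
    private volatile T cached;
    private volatile long lastLoaded;

    CachedReport(Supplier<T> loader, long refreshMillis) {
        this.loader = loader;
        this.refreshMillis = refreshMillis;
    }

    synchronized T get() {
        long now = System.currentTimeMillis();
        if (cached == null || now - lastLoaded >= refreshMillis) {
            cached = loader.get();      // expensive call happens rarely
            lastLoaded = now;
        }
        return cached;
    }
}
```

The design choice here is staleness for latency: callers may see a report up to one interval old, which is acceptable for choosing a redirect target but trims the multi-second `getDatanodeReport` cost off the request path.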
[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140779#comment-17140779 ] Stephen O'Donnell commented on HDFS-15415: -- If a block is RBW or RUR before the snapshot of memory is taken, then it will never be part of the in memory blocks for that pass of the scanner. RBW blocks should be skipped by the disk scan too. If a block goes FINALIZED to RBW (due to append), then we may record a difference or we may not, depending on the sequence of events. There are always going to be some "false positives" in the comparison, as the disk picture will always be changing before we take the lock, even with the code as it is now. That is why I believe we can do the processing against the memory snapshot without the lock. The price we pay is possibly some more differences which have to be reconciled later. The faster we can make the scan step, the fewer false positives there will be for reconcile later. As with many of these locking problems, it is hard to be 100% sure this will not cause some other problems, but from what I looked at today, I think it should be good. > Reduce locking in Datanode DirectoryScanner > --- > > Key: HDFS-15415 > URL: https://issues.apache.org/jira/browse/HDFS-15415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.4.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: HDFS-15415.001.patch > > > In HDFS-15406, we have a small change to greatly reduce the runtime and > locking time of the datanode DirectoryScanner. There may be room for further > improvement here: > 1. These lines of code in DirectoryScanner#scan() obtain a snapshot of the > finalized blocks from memory, and then sort them, under the DN lock. However > the blocks are stored in a sorted structure (FoldedTreeSet) and hence the > sort should be unnecessary. 
> {code} > final List bl = dataset.getFinalizedBlocks(bpid); > Collections.sort(bl); // Sort based on blockId > {code} > 2. From the scan step, we have captured a snapshot of what is on disk. After > calling `dataset.getFinalizedBlocks(bpid);` as above we have taken a snapshot > of what is in memory. The two snapshots are never 100% in sync as things are always > changing as the disk is scanned. > We are only comparing finalized blocks, so they should not really change: > * If a block is deleted after our snapshot, our snapshot will not see it and > that is OK. > * A finalized block could be appended. If that happens both the genstamp and > length will change, but that should be handled by reconcile when it calls > `FSDatasetImpl.checkAndUpdate()`, and there is nothing stopping blocks being > appended after they have been scanned from disk, but before they have been > compared with memory. > My suspicion is that we can do all the comparison work outside of the lock > and checkAndUpdate() re-checks any differences later under the lock on a > block by block basis.
[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140812#comment-17140812 ] Hadoop QA commented on HDFS-15415: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 5s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 3s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 224 unchanged - 0 fixed = 225 total (was 224) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 8s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}125m 37s{color} | {color:red} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 7s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}200m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.TestReconstructStripedFile | | | hadoop.hdfs.TestGetFileChecksum | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29441/artifact/out/Dockerfile | | JIRA Issue | HDFS-15415 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13006051/HDFS-15415.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7a051fef76dc 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | per
[jira] [Updated] (HDFS-15422) Reported IBR is partially replaced with stored info when queuing.
[ https://issues.apache.org/jira/browse/HDFS-15422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-15422: -- Description: When queueing an IBR (incremental block report) on a standby namenode, some of the reported information is being replaced with the existing stored information. This can lead to false block corruption. We had a namenode that, after transitioning to active, started reporting missing blocks with "SIZE_MISMATCH" as corrupt reason. These were blocks that were appended and the sizes were actually correct on the datanodes. Upon further investigation, it was determined that the namenode was queueing IBRs with altered information. Although it sounds bad, I am not making it a blocker. was: When queueing an IBR (incremental block report) on a standby namenode, some of the reported information is being replaced with the existing stored information. This can lead to false block corruption. We had a namenode that, after transitioning to active, started reporting missing blocks with "SIZE_MISMATCH" as corrupt reason. These were blocks that were appended and the sizes were actually correct on the datanodes. Upon further investigation, it was determined that the namenode was queueing IBRs with altered information. > Reported IBR is partially replaced with stored info when queuing. > - > > Key: HDFS-15422 > URL: https://issues.apache.org/jira/browse/HDFS-15422 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Priority: Critical > > When queueing an IBR (incremental block report) on a standby namenode, some > of the reported information is being replaced with the existing stored > information. This can lead to false block corruption. > We had a namenode that, after transitioning to active, started reporting missing > blocks with "SIZE_MISMATCH" as corrupt reason. These were blocks that were > appended and the sizes were actually correct on the datanodes. 
Upon further > investigation, it was determined that the namenode was queueing IBRs with > altered information. > Although it sounds bad, I am not making it a blocker.
[jira] [Commented] (HDFS-13082) cookieverf mismatch error over NFS gateway on Linux
[ https://issues.apache.org/jira/browse/HDFS-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140885#comment-17140885 ] Daniel Howard commented on HDFS-13082: -- I am running into this as well, but the AIX compatibility trick did not help. For example: {{0-15:58 djh@c24-03-06 ~> *ls /hadoop/wxxxs/data/*}} {{# No files listed}} {{0-16:01 djh@c24-03-06 ~> *touch /hadoop/wxxxs/data/foo*}} {{0-16:01 djh@c24-03-06 ~> *ls /hadoop/wxxxs/data/*}} {{foo packed-hbfs/ raw/ tmp/}} {{0-16:01 djh@c24-03-06 ~> *rm /hadoop/wxxxs/data/foo*}} {{0-16:01 djh@c24-03-06 ~> *ls /hadoop/wxxxs/data/*}} {{packed-hbfs/ raw/ tmp/}} Writing to this directory forced the NFS server to return the correct directory contents. I have a bunch of this in the log: {{2020-06-19 16:01:35,281 ERROR org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3: cookieverf mismatch. request cookieverf: 1591897331315 dir cookieverf: 1592428367587}} {{2020-06-19 16:01:35,287 ERROR org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3: cookieverf mismatch. request cookieverf: 1591897331315 dir cookieverf: 1592428367587}} {{2020-06-19 16:01:35,454 ERROR org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3: cookieverf mismatch. request cookieverf: 1591897331315 dir cookieverf: 1592428367587}} I am tempted to fiddle with _dfs.namenode.accesstime.precision_ but .. ?! > cookieverf mismatch error over NFS gateway on Linux > --- > > Key: HDFS-13082 > URL: https://issues.apache.org/jira/browse/HDFS-13082 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3 >Reporter: Dan Moraru >Priority: Minor > > Running 'ls' on some directories over an HDFS-NFS gateway sometimes fails to > list the contents of those directories. Running 'ls' on those same > directories mounted via FUSE works. The NFS gateway logs errors like the > following: > 2018-01-29 11:53:01,130 ERROR org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3: > cookieverf mismatch. 
request cookieverf: 1513390944415 dir cookieverf: > 1516920857335 > Reviewing > hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java > suggested that these errors can be avoided by setting > nfs.aix.compatibility.mode.enabled=true, and that is indeed the case. The > documentation lists https://issues.apache.org/jira/browse/HDFS-6549 as a > known issue, but also goes on to say that "regular, non-AIX clients should > NOT enable AIX compatibility mode. The work-arounds implemented by AIX > compatibility mode effectively disable safeguards to ensure that listing of > directory contents via NFS returns consistent results, and that all data sent > to the NFS server can be assured to have been committed." Server and client > in this case are one and the same, running Scientific Linux 7.4.
[jira] [Created] (HDFS-15423) RBF: WebHDFS create shouldn't choose DN from all sub-clusters
Chao Sun created HDFS-15423: --- Summary: RBF: WebHDFS create shouldn't choose DN from all sub-clusters Key: HDFS-15423 URL: https://issues.apache.org/jira/browse/HDFS-15423 Project: Hadoop HDFS Issue Type: Bug Components: rbf Reporter: Chao Sun In {{RouterWebHdfsMethods}} and for a {{CREATE}} call, {{chooseDatanode}} first gets all DNs via {{getDatanodeReport}}, and then randomly picks one from the list via {{getRandomDatanode}}. This logic doesn't seem correct as it should pick a DN for the specific cluster(s) of the input {{path}}.
[jira] [Updated] (HDFS-15423) RBF: WebHDFS create shouldn't choose DN from all sub-clusters
[ https://issues.apache.org/jira/browse/HDFS-15423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HDFS-15423: Component/s: webhdfs > RBF: WebHDFS create shouldn't choose DN from all sub-clusters > - > > Key: HDFS-15423 > URL: https://issues.apache.org/jira/browse/HDFS-15423 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf, webhdfs >Reporter: Chao Sun >Priority: Major > > In {{RouterWebHdfsMethods}} and for a {{CREATE}} call, {{chooseDatanode}} > first gets all DNs via {{getDatanodeReport}}, and then randomly picks one from > the list via {{getRandomDatanode}}. This logic doesn't seem correct as it > should pick a DN for the specific cluster(s) of the input {{path}}.
[jira] [Commented] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool
[ https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140960#comment-17140960 ] Yiqun Lin commented on HDFS-15410: -- Besides [~elgoiri]'s review comment, some more review comments from me: I don't fully understand why we need to define the implementation class in config and get the instance via reflection. Currently there is no other implementation class, so why not just create a new FedBalance/BalanceJournalInfoHDFS instance in the code? From my understanding, these two config settings can be removed. {code:java} federation.balance.class hadoop.hdfs.procedure.journal.class // init journal. Class clazz = (Class) conf .getClass(JOURNAL_CLASS, BalanceJournalInfoHDFS.class); journal = ReflectionUtils.newInstance(clazz, conf); Class balanceClazz = (Class) conf .getClass(FEDERATION_BALANCE_CLASS, FedBalance.class); Tool balancer = ReflectionUtils.newInstance(balanceClazz, conf); {code} Can we rename the class from {{DistCpBalanceOptions}} to {{FedBalanceOptions}}? This will be more readable since these options are specific to the fedbalance tool. Can we rename the config prefix from {{hadoop.hdfs.procedure.work.thread.num}} to {{hdfs.fedbalance.procedure.work.thread.num}}? The following description needs to be updated since the -router option no longer takes true or false as a parameter. {noformat} final static Option ROUTER = new Option("router", false, "If `true` the command runs in router mode. The source path is " + "taken as a mount point. It will disable write by setting the mount" + " point readonly. Otherwise the command works in normal federation" + " mode. The source path is taken as the full path. It will disable" + " write by cancelling all permissions of the source path. 
The" + " default value is `true`."); {noformat} > Add separated config file fedbalance-default.xml for fedbalance tool > > > Key: HDFS-15410 > URL: https://issues.apache.org/jira/browse/HDFS-15410 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-15410.001.patch, HDFS-15410.002.patch > > > Add a separated config file named fedbalance-default.xml for fedbalance tool > configs. It's like the distcp-default.xml for distcp tool.
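The reflection-vs-direct-instantiation point in the review above can be illustrated as follows. This is a hedged sketch: the interface, class names, and factory methods are placeholders, not the real fedbalance types:

```java
// Two ways to obtain the single known journal implementation. The reflective
// variant mirrors what the patch does (class looked up from configuration and
// instantiated via reflection); the direct variant mirrors the review
// suggestion (just construct the one implementation that exists).
interface BalanceJournal {
    void save(String jobId);
}

class BalanceJournalInfoHDFS implements BalanceJournal {
    public void save(String jobId) {
        // Stand-in body: the real class would persist job state to HDFS.
    }
}

class JournalFactory {
    // Reflective variant: needs a config key and a no-arg constructor, but
    // buys no flexibility while only one implementation exists.
    static BalanceJournal viaReflection(Class<? extends BalanceJournal> clazz) {
        try {
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("failed to instantiate " + clazz, e);
        }
    }

    // Direct variant: simpler, and just as easy to generalize later if a
    // second implementation ever appears.
    static BalanceJournal direct() {
        return new BalanceJournalInfoHDFS();
    }
}
```

Both produce the same object; the review's argument is that the configuration indirection adds complexity without a second implementation to select.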
[jira] [Created] (HDFS-15424) Javadoc failing with "cannot find symbol com.google.protobuf.GeneratedMessageV3 implements"
Uma Maheswara Rao G created HDFS-15424: -- Summary: Javadoc failing with "cannot find symbol com.google.protobuf.GeneratedMessageV3 implements" Key: HDFS-15424 URL: https://issues.apache.org/jira/browse/HDFS-15424 Project: Hadoop HDFS Issue Type: Bug Reporter: Uma Maheswara Rao G {noformat} [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 17.982 s [INFO] Finished at: 2020-06-20T01:56:28Z [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:3.0.1:javadoc (default-cli) on project hadoop-hdfs: An error has occurred in Javadoc report generation: [ERROR] Exit code: 1 - javadoc: warning - You have specified the HTML version as HTML 4.01 by using the -html4 option. [ERROR] The default is currently HTML5 and the support for HTML 4.01 will be removed [ERROR] in a future release. To suppress this warning, please ensure that any HTML constructs [ERROR] in your comments are valid in HTML5, and remove the -html4 option. [ERROR] /home/jenkins/jenkins-slave/workspace/hadoop-multibranch_PR-2084/src/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/server/namenode/FsImageProto.java:25197: error: cannot find symbol [ERROR] com.google.protobuf.GeneratedMessageV3 implements [ERROR] ^ [ERROR] symbol: class GeneratedMessageV3 [ERROR] location: package com.google.protobuf [ERROR] /home/jenkins/jenkins-slave/workspace/hadoop-multibranch_PR-2084/src/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/server/namenode/FsImageProto.java:25319: error: cannot find symbol [ERROR] com.google.protobuf.GeneratedMessageV3 implements [ERROR]^ [ERROR] symbol: class GeneratedMessageV3 [ERROR] location: package com.google.protobuf [ERROR] /home/jenkins/jenkins-slave/workspace/hadoop-multibranch_PR-2084/src/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/server/namenode/FsImageProto.java:26068: error: cannot find symbol [ERROR] com.google.protobuf.GeneratedMessageV3 
implements [ERROR]^ [ERROR] symbol: class GeneratedMessageV3 [ERROR] location: package com.google.protobuf [ERROR] /home/jenkins/jenkins-slave/workspace/hadoop-multibranch_PR-2084/src/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java/org/apache/hadoop/hdfs/server/namenode/FsImageProto.java:26073: error: package com.google.protobuf.GeneratedMessageV3 does not exist [ERROR] private PersistToken(com.google.protobuf.GeneratedMessageV3.Builder builder) { {noformat}
[jira] [Commented] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140283#comment-17140283 ] bhji123 commented on HDFS-15419: Yes, but clients may not be configured appropriately. If the router can also retry, the system will be more reliable. > RBF: Router should retry communicate with NN when cluster is unavailable > using configurable time interval > - > > Key: HDFS-15419 > URL: https://issues.apache.org/jira/browse/HDFS-15419 > Project: Hadoop HDFS > Issue Type: Improvement > Components: configuration, hdfs-client, rbf >Reporter: bhji123 >Priority: Major > > When cluster is unavailable, router -> namenode communication will only retry > once without any time interval, that is not reasonable. > For example, in my company, which has several hdfs clusters with more than > 1000 nodes, we have encountered this problem. In some cases, the cluster > becomes unavailable briefly for about 10 or 30 seconds, at the same time, > almost all rpc requests to router failed because router only retry once > without time interval. > It's better for us to enhance the router retry strategy, to retry > **communicate with NN using configurable time interval and max retry times. >
[jira] [Issue Comment Deleted] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bhji123 updated HDFS-15419: --- Comment: was deleted (was: Yes, but clients may not configured appropriately. But if router can retry too, it will be more reliable.) > RBF: Router should retry communicate with NN when cluster is unavailable > using configurable time interval > - > > Key: HDFS-15419 > URL: https://issues.apache.org/jira/browse/HDFS-15419 > Project: Hadoop HDFS > Issue Type: Improvement > Components: configuration, hdfs-client, rbf >Reporter: bhji123 >Priority: Major > > When cluster is unavailable, router -> namenode communication will only retry > once without any time interval, that is not reasonable. > For example, in my company, which has several hdfs clusters with more than > 1000 nodes, we have encountered this problem. In some cases, the cluster > becomes unavailable briefly for about 10 or 30 seconds, at the same time, > almost all rpc requests to router failed because router only retry once > without time interval. > It's better for us to enhance the router retry strategy, to retry > **communicate with NN using configurable time interval and max retry times. >
[jira] [Commented] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool
[ https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140299#comment-17140299 ] Hadoop QA commented on HDFS-15410: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 36s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 35s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 11s{color} | {color:orange} hadoop-tools/hadoop-federation-balance: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 18s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 17s{color} | {color:green} hadoop-federation-balance in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 69m 49s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29440/artifact/out/Dockerfile | | JIRA Issue | HDFS-15410 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13006034/HDFS-15410.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 8231b6a1b035 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.s
[jira] [Commented] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140300#comment-17140300 ] bhji123 commented on HDFS-15419: Yes, the router is just a proxy, but it's also a server. Clients can decide whether to wait/retry or not, but not all clients are that smart, especially when there is a variety of different clients. For less sophisticated clients, this PR is very useful. For sophisticated clients that don't want the router to retry, it's fine too, because the router retry is now configurable. > RBF: Router should retry communicate with NN when cluster is unavailable > using configurable time interval > - > > Key: HDFS-15419 > URL: https://issues.apache.org/jira/browse/HDFS-15419 > Project: Hadoop HDFS > Issue Type: Improvement > Components: configuration, hdfs-client, rbf >Reporter: bhji123 >Priority: Major > > When cluster is unavailable, router -> namenode communication will only retry > once without any time interval, that is not reasonable. > For example, in my company, which has several hdfs clusters with more than > 1000 nodes, we have encountered this problem. In some cases, the cluster > becomes unavailable briefly for about 10 or 30 seconds, at the same time, > almost all rpc requests to router failed because router only retry once > without time interval. > It's better for us to enhance the router retry strategy, to retry > **communicate with NN using configurable time interval and max retry times. >
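The enhancement the ticket proposes boils down to a bounded retry loop with a sleep between attempts. A minimal sketch under assumed names follows; this is not the actual Router code, and `maxRetries`/`intervalMs` merely stand in for the proposed configuration keys.

```java
import java.util.concurrent.Callable;

public class RetryUtil {
    /**
     * Invoke the call, retrying up to maxRetries times with a fixed sleep
     * between attempts. Rethrows the last failure once retries are exhausted.
     */
    public static <T> T retryWithInterval(Callable<T> call, int maxRetries,
                                          long intervalMs) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(intervalMs);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulate a cluster that is briefly unavailable: fails twice, then recovers.
        int[] failures = {2};
        String result = retryWithInterval(() -> {
            if (failures[0]-- > 0) {
                throw new RuntimeException("cluster briefly unavailable");
            }
            return "ok";
        }, 3, 10L);
        System.out.println(result); // ok
    }
}
```

In the real patch the interval and max-retry count would presumably come from Configuration; an exponential-backoff variant is a common alternative to the fixed interval shown here.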