[jira] [Created] (HDFS-12875) RBF: Complete logic for -readonly option of dfsrouteradmin add command
Yiqun Lin created HDFS-12875: Summary: RBF: Complete logic for -readonly option of dfsrouteradmin add command Key: HDFS-12875 URL: https://issues.apache.org/jira/browse/HDFS-12875 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0-alpha3 Reporter: Yiqun Lin Assignee: Yiqun Lin Currently the option -readonly of the command {{dfsrouteradmin -add}} doesn't do anything. The desired behavior is that a read-only mount table entry set in the add command cannot be removed. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12528) Short-circuit reads unnecessarily disabled for a long time
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272279#comment-16272279 ] Weiwei Yang commented on HDFS-12528: bq. What's the risk if we don't disable the SCR at all when we get any IOException? This was some legacy code; I believe the purpose was to prevent something bad from happening when SCR keeps failing. An alternative way to handle such a case is to disable SCR only when it fails with the {{same unknown}} exception a configurable number of times, with a default value of e.g. 5. This also gives users a way to never disable it (by setting it to 0) if they want, like us. > Short-circuit reads unnecessarily disabled for a long time > -- > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: John Zhuge > Attachments: HDFS-12528.000.patch > > > We have scenarios where data ingestion makes use of the -appendToFile > operation to add new data to existing HDFS files. In these situations, we're > frequently running into the problem described below. > We're using Impala to query the HDFS data with short-circuit reads (SCR) > enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce > the memory footprint. In some cases, though, Impala still keeps the HDFS file > handle open for reuse. > The "unbuffer" call, however, causes the file's current block reader to be > closed, which makes the associated ShortCircuitReplica evictable from the > ShortCircuitCache. When the cluster is under load, this means that the > ShortCircuitReplica can be purged off the cache pretty fast, which closes the > file descriptor to the underlying storage file. > That means that when Impala re-reads the file it has to re-open the storage > files associated with the ShortCircuitReplica's that were evicted from the > cache. 
If there were no appends to those blocks, the re-open will succeed > without problems. If one block was appended since the ShortCircuitReplica was > created, the re-open will fail with the following error: > {code} > Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 > not found > {code} > This error is handled as an "unknown response" by the BlockReaderFactory [1], > which disables short-circuit reads for 10 minutes [2] for the client. > These 10 minutes without SCR can have a big performance impact for the client > operations. In this particular case ("Meta file not found") it would suffice > to return null without disabling SCR. This particular block read would fall > back to the normal, non-short-circuited, path and other SCR requests would > continue to work as expected. > It might also be interesting to be able to control how long SCR is disabled > for in the "unknown response" case. 10 minutes seems a bit too long and not > being able to change that is a problem. > [1] > https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646 > [2] > https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97
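The alternative floated in the comment above (disable SCR only after the unknown failure repeats a configurable number of times, with 0 meaning never disable) could look roughly like the following sketch. All class and method names here are invented for illustration; this is not the actual HDFS client code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the suggested policy: disable short-circuit reads
// only after N consecutive "unknown response" failures. Names are invented;
// this is not the real DomainSocketFactory/BlockReaderFactory logic.
public class ScrDisablePolicy {
    private final int maxFailures;                      // 0 => never disable SCR
    private final AtomicInteger failures = new AtomicInteger();
    private volatile long disabledUntilMs = 0;

    public ScrDisablePolicy(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    /** Record one "unknown response" failure; maybe open the disable window. */
    public void onUnknownFailure(long nowMs, long disableWindowMs) {
        if (maxFailures == 0) {
            return;                                     // user opted out of disabling
        }
        if (failures.incrementAndGet() >= maxFailures) {
            disabledUntilMs = nowMs + disableWindowMs;
            failures.set(0);
        }
    }

    /** A successful short-circuit read resets the failure counter. */
    public void onSuccess() {
        failures.set(0);
    }

    public boolean isDisabled(long nowMs) {
        return nowMs < disabledUntilMs;
    }

    public static void main(String[] args) {
        ScrDisablePolicy p = new ScrDisablePolicy(5);   // suggested default of 5
        for (int i = 0; i < 4; i++) {
            p.onUnknownFailure(0, 600_000);             // 10-minute window
        }
        System.out.println(p.isDisabled(0));            // false: under threshold
        p.onUnknownFailure(0, 600_000);                 // 5th consecutive failure
        System.out.println(p.isDisabled(0));            // true: window is open
        ScrDisablePolicy never = new ScrDisablePolicy(0);
        never.onUnknownFailure(0, 600_000);
        System.out.println(never.isDisabled(0));        // false: never disabled
    }
}
```

Setting the threshold to 0 gives operators the "never disable" behavior mentioned in the comment, while the single "Meta file not found" case would no longer cost 10 minutes of SCR on its own.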
[jira] [Commented] (HDFS-12840) Creating a replicated file in an EC zone is not correctly serialized in EditLogs
[ https://issues.apache.org/jira/browse/HDFS-12840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272270#comment-16272270 ] SammiChen commented on HDFS-12840: -- Hi Eddy, thanks for working on it. Some comments: 1. {{REPLICATION_POLICY_ID}} is already defined in {{ErasureCodeConstants}} with value 63. Suggest reusing it. 2. {{TestRetryCacheWithHA}}: 40 instead of 41. bq. assertEquals("Retry cache size is wrong", 41, cacheSet.size()); > Creating a replicated file in an EC zone is not correctly serialized in > EditLogs > - > > Key: HDFS-12840 > URL: https://issues.apache.org/jira/browse/HDFS-12840 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Affects Versions: 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Blocker > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-12840.00.patch, HDFS-12840.01.patch, > HDFS-12840.02.patch, HDFS-12840.reprod.patch, editsStored, editsStored > > > When creating a replicated file in an existing EC zone, the edit log does not > differentiate it from an EC file. When {{FSEditLogLoader}} replays the edits, > this file is treated as an EC file; as a result, it crashes the NN because the > blocks of this file are replicated, which does not match the {{INode}}. 
> {noformat} > ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered > exception on operation AddBlockOp [path=/system/balancer.id, > penultimateBlock=NULL, lastBlock=blk_1073743259_2455, RpcClientId=, > RpcCallId=-2] > java.lang.IllegalArgumentException: reportedBlock is not striped > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.addStorage(BlockInfoStriped.java:118) > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.addBlock(DatanodeStorageInfo.java:256) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3141) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlockUnderConstruction(BlockManager.java:3068) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:3864) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processQueuedMessages(BlockManager.java:2916) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processQueuedMessagesForBlock(BlockManager.java:2903) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.addNewBlock(FSEditLogLoader.java:1069) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:532) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) > at > 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) > {noformat}
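The crash above happens because edit-log replay cannot tell a replicated file in an EC zone from a striped one. A minimal, standalone sketch of the fix direction keys the block type off the file's own policy id (the {{REPLICATION_POLICY_ID}} value 63 mentioned in the comment) rather than the enclosing zone; everything else here is invented for illustration and is not the actual {{FSEditLogLoader}} code.

```java
// Standalone illustration only: a replicated file inside an EC zone must
// still get CONTIGUOUS blocks on replay, otherwise BlockInfoStriped.addStorage
// fails its "reportedBlock is not striped" precondition, as in the trace above.
public class EditReplaySketch {
    // Value cited in the review comment (ErasureCodeConstants in Hadoop).
    static final byte REPLICATION_POLICY_ID = 63;

    enum BlockType { CONTIGUOUS, STRIPED }

    /** Choose the block type from the file's recorded EC policy id. */
    static BlockType blockTypeFor(byte ecPolicyId) {
        return ecPolicyId == REPLICATION_POLICY_ID
            ? BlockType.CONTIGUOUS
            : BlockType.STRIPED;
    }

    public static void main(String[] args) {
        System.out.println(blockTypeFor(REPLICATION_POLICY_ID)); // CONTIGUOUS
        System.out.println(blockTypeFor((byte) 1));              // STRIPED
    }
}
```

The point is that the serialized edit must carry enough information (the per-file policy id) for replay to make this choice, instead of inferring "striped" from the zone.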
[jira] [Commented] (HDFS-11640) [READ] Datanodes should use a unique identifier when reading from external stores
[ https://issues.apache.org/jira/browse/HDFS-11640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272260#comment-16272260 ] genericqa commented on HDFS-11640: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} HDFS-9806 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 40s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 9s{color} | {color:red} root in HDFS-9806 failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 14m 57s{color} | {color:red} root in HDFS-9806 failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} HDFS-9806 passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 42s{color} | {color:red} hadoop-hdfs in HDFS-9806 failed. {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 1m 13s{color} | {color:red} hadoop-fs2img in HDFS-9806 failed. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 18m 54s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 33s{color} | {color:red} hadoop-fs2img in HDFS-9806 failed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s{color} | {color:green} HDFS-9806 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 7s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 39s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 19s{color} | {color:red} hadoop-fs2img in the patch failed. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 47s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 16m 47s{color} | {color:red} root generated 514 new + 725 unchanged - 0 fixed = 1239 total (was 725) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 28s{color} | {color:orange} root: The patch generated 34 new + 0 unchanged - 0 fixed = 34 total (was 0) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 46s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 27s{color} | {color:red} hadoop-fs2img in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 35s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 47s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 34s{color} | {color:red} hadoop-fs2img in the patch failed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 46s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 28s{color} | {color:red} hadoop-fs2img in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 84m 12s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs | | | Found reliance on default
[jira] [Commented] (HDFS-12685) [READ] FsVolumeImpl exception when scanning Provided storage volume
[ https://issues.apache.org/jira/browse/HDFS-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272255#comment-16272255 ] genericqa commented on HDFS-12685: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} HDFS-9806 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 47s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 57s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} HDFS-9806 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 52 unchanged - 2 fixed = 52 total (was 54) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 35s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 46s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}131m 53s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Unreaped Processes | hadoop-hdfs:3 | | Failed junit tests | hadoop.hdfs.client.impl.TestClientBlockVerification | | | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.web.TestWebHDFS | | | hadoop.hdfs.TestFileAppend3 | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12685 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12899947/HDFS-12685-HDFS-9806.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 84da78a88b48 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-9806 / 6e805c0 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Unreaped Processes Log | https://builds.apache.org/job/PreCommit-HDFS-Build/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-reaper.txt | | unit |
[jira] [Commented] (HDFS-12528) Short-circuit reads unnecessarily disabled for a long time
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272240#comment-16272240 ] Gang Xie commented on HDFS-12528: - What's the risk if we don't disable the SCR at all when we get any IOException? > Short-circuit reads unnecessarily disabled for a long time > -- > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: John Zhuge > Attachments: HDFS-12528.000.patch > > > We have scenarios where data ingestion makes use of the -appendToFile > operation to add new data to existing HDFS files. In these situations, we're > frequently running into the problem described below. > We're using Impala to query the HDFS data with short-circuit reads (SCR) > enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce > the memory footprint. In some cases, though, Impala still keeps the HDFS file > handle open for reuse. > The "unbuffer" call, however, causes the file's current block reader to be > closed, which makes the associated ShortCircuitReplica evictable from the > ShortCircuitCache. When the cluster is under load, this means that the > ShortCircuitReplica can be purged off the cache pretty fast, which closes the > file descriptor to the underlying storage file. > That means that when Impala re-reads the file it has to re-open the storage > files associated with the ShortCircuitReplica's that were evicted from the > cache. If there were no appends to those blocks, the re-open will succeed > without problems. 
If one block was appended since the ShortCircuitReplica was > created, the re-open will fail with the following error: > {code} > Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 > not found > {code} > This error is handled as an "unknown response" by the BlockReaderFactory [1], > which disables short-circuit reads for 10 minutes [2] for the client. > These 10 minutes without SCR can have a big performance impact for the client > operations. In this particular case ("Meta file not found") it would suffice > to return null without disabling SCR. This particular block read would fall > back to the normal, non-short-circuited, path and other SCR requests would > continue to work as expected. > It might also be interesting to be able to control how long SCR is disabled > for in the "unknown response" case. 10 minutes seems a bit too long and not > being able to change that is a problem. > [1] > https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646 > [2] > https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97
[jira] [Commented] (HDFS-11751) DFSZKFailoverController daemon exits with wrong status code
[ https://issues.apache.org/jira/browse/HDFS-11751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272228#comment-16272228 ] Brahma Reddy Battula commented on HDFS-11751: - +1 on the latest patch. Will commit. > DFSZKFailoverController daemon exits with wrong status code > --- > > Key: HDFS-11751 > URL: https://issues.apache.org/jira/browse/HDFS-11751 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Affects Versions: 3.0.0-alpha2 >Reporter: Doris Gu >Assignee: Bharat Viswanadham > Attachments: HDFS-11751.001.patch, HDFS-11751.02.patch > > > 1. Use *hdfs zkfc* to start a zkfc daemon; > 2. zkfc failed to start, but we got a success exit code.
[jira] [Commented] (HDFS-12834) DFSZKFailoverController on error exits with 0 error code
[ https://issues.apache.org/jira/browse/HDFS-12834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272226#comment-16272226 ] Brahma Reddy Battula commented on HDFS-12834: - Sure,will look into HDFS-11751. Sorry for delay. > DFSZKFailoverController on error exits with 0 error code > > > Key: HDFS-12834 > URL: https://issues.apache.org/jira/browse/HDFS-12834 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.7.3, 3.0.0-alpha4 >Reporter: Zbigniew Kostrzewa >Assignee: Bharat Viswanadham > Attachments: HDFS-12834.00.patch, HDFS-12834.01.patch > > > On error {{DFSZKFailoverController}} exits with 0 return code which leads to > problems when integrating it with scripts and monitoring tools, e.g. systemd, > which when configured to restart the service only on failure does not restart > ZKFC because it exited with 0. > For example, in my case, systemd reported zkfc exited with success but in > logs I have found this: > {noformat} > 2017-11-14 05:33:55,075 INFO org.apache.zookeeper.ClientCnxn: Client session > timed out, have not heard from server in 3334ms for sessionid > 0x15fb794bd240001, closing socket connection and attempting reconnect > 2017-11-14 05:33:55,178 INFO org.apache.hadoop.ha.ActiveStandbyElector: > Session disconnected. Entering neutral mode... > 2017-11-14 05:33:55,564 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server 10.9.4.73/10.9.4.73:2182. Will not attempt to > authenticate using SASL (unknown error) > 2017-11-14 05:33:55,566 INFO org.apache.zookeeper.ClientCnxn: Socket > connection established to 10.9.4.73/10.9.4.73:2182, initiating session > 2017-11-14 05:33:55,569 INFO org.apache.zookeeper.ClientCnxn: Session > establishment complete on server 10.9.4.73/10.9.4.73:2182, sessionid = > 0x15fb794bd240001, negotiated timeout = 5000 > 2017-11-14 05:33:55,570 INFO org.apache.hadoop.ha.ActiveStandbyElector: > Session connected. 
> 2017-11-14 05:33:58,230 INFO org.apache.zookeeper.ClientCnxn: Unable to read > additional data from server sessionid 0x15fb794bd240001, likely server has > closed socket, closing socket connection and attempting reconnect > 2017-11-14 05:33:58,335 INFO org.apache.hadoop.ha.ActiveStandbyElector: > Session disconnected. Entering neutral mode... > 2017-11-14 05:33:58,402 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server 10.9.4.138/10.9.4.138:2181. Will not attempt to > authenticate using SASL (unknown error) > 2017-11-14 05:33:58,403 INFO org.apache.zookeeper.ClientCnxn: Socket > connection established to 10.9.4.138/10.9.4.138:2181, initiating session > 2017-11-14 05:33:58,406 INFO org.apache.zookeeper.ClientCnxn: Unable to read > additional data from server sessionid 0x15fb794bd240001, likely server has > closed socket, closing socket connection and attempting reconnect > 2017-11-14 05:33:59,218 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server 10.9.4.228/10.9.4.228:2183. Will not attempt to > authenticate using SASL (unknown error) > 2017-11-14 05:33:59,219 INFO org.apache.zookeeper.ClientCnxn: Socket > connection established to 10.9.4.228/10.9.4.228:2183, initiating session > 2017-11-14 05:33:59,221 INFO org.apache.zookeeper.ClientCnxn: Unable to read > additional data from server sessionid 0x15fb794bd240001, likely server has > closed socket, closing socket connection and attempting reconnect > 2017-11-14 05:34:01,094 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server 10.9.4.73/10.9.4.73:2182. Will not attempt to > authenticate using SASL (unknown error) > 2017-11-14 05:34:01,094 INFO org.apache.zookeeper.ClientCnxn: Client session > timed out, have not heard from server in 1773ms for sessionid > 0x15fb794bd240001, closing socket connection and attempting reconnect > 2017-11-14 05:34:01,196 FATAL org.apache.hadoop.ha.ActiveStandbyElector: > Received stat error from Zookeeper. 
code:CONNECTIONLOSS. Not retrying further > znode monitoring connection errors. > 2017-11-14 05:34:02,153 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x15fb794bd240001 closed > 2017-11-14 05:34:02,154 FATAL org.apache.hadoop.ha.ZKFailoverController: > Fatal error occurred:Received stat error from Zookeeper. code:CONNECTIONLOSS. > Not retrying further znode monitoring connection errors. > 2017-11-14 05:34:02,154 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2017-11-14 05:34:05,208 INFO org.apache.hadoop.ipc.Server: Stopping server on > 8019 > 2017-11-14 05:34:05,487 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server listener on 8019 > 2017-11-14 05:34:05,488 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server Responder > 2017-11-14 05:34:05,487 INFO
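The systemd interaction described in this issue can be shown with a minimal unit fragment (illustrative only; the unit name, path, and options are assumptions, not part of any Hadoop distribution):

```ini
# Illustrative zkfc.service fragment. With Restart=on-failure, systemd only
# restarts the service on a non-zero exit status -- so when ZKFC hits a fatal
# ZooKeeper error but still exits 0, it is reported as "success" and never
# restarted, exactly the behavior described above.
[Service]
Type=simple
ExecStart=/opt/hadoop/bin/hdfs zkfc
Restart=on-failure

# Possible workaround until the exit code is fixed:
# Restart=always
```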
[jira] [Created] (HDFS-12874) [READ] Documentation for provided storage
Chris Douglas created HDFS-12874: Summary: [READ] Documentation for provided storage Key: HDFS-12874 URL: https://issues.apache.org/jira/browse/HDFS-12874 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Chris Douglas The configuration and deployment of provided storage should be documented for end-users.
[jira] [Updated] (HDFS-11640) [READ] Datanodes should use a unique identifier when reading from external stores
[ https://issues.apache.org/jira/browse/HDFS-11640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-11640: - Attachment: HDFS-11640-HDFS-9806.003.patch Rebased patch, made a minor tweak to the open call in {{ProvidedReplica}} to fail if the {{PathHandle}} is present but the underlying {{FileSystem}} claims not to support it. If the handle was obtained from the FS, it's probably a misconfiguration if it fails to resolve later. As a followup, it may make sense to propagate the {{BlockAlias}}, rather than the {{FileRegion}} fields, as arguments. Similarly, the {{PathHandle}} could replace the URI, since it should only resolve if its referent is unmodified. These changes are broader, and can be deferred. > [READ] Datanodes should use a unique identifier when reading from external > stores > - > > Key: HDFS-11640 > URL: https://issues.apache.org/jira/browse/HDFS-11640 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Virajith Jalaparti > Attachments: HDFS-11640-HDFS-9806.001.patch, > HDFS-11640-HDFS-9806.002.patch, HDFS-11640-HDFS-9806.003.patch > > > Use a unique identifier when reading from external stores to ensure that > datanodes read the correct (version of) file.
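The fail-fast tweak described in the comment can be sketched as follows. The types below are invented stand-ins, not Hadoop's actual {{PathHandle}}/{{FileSystem}} API; the point is only the check itself.

```java
import java.io.IOException;
import java.util.Optional;

// Standalone sketch: if a handle was recorded for a provided replica but the
// FileSystem no longer claims to support handles, fail loudly rather than
// silently falling back to the (possibly stale) URI. Since the handle could
// only have come from this FS earlier, a mismatch suggests misconfiguration.
public class ProvidedOpenSketch {
    /** Stand-in for the remote FileSystem's capability check. */
    interface RemoteFs {
        boolean supportsPathHandles();
    }

    static void checkHandle(RemoteFs fs, Optional<String> pathHandle)
            throws IOException {
        if (pathHandle.isPresent() && !fs.supportsPathHandles()) {
            throw new IOException(
                "PathHandle present but FileSystem does not support handles");
        }
    }

    public static void main(String[] args) throws IOException {
        checkHandle(() -> true, Optional.of("h1"));   // ok: handle supported
        checkHandle(() -> false, Optional.empty());   // ok: no handle recorded
        try {
            checkHandle(() -> false, Optional.of("h1"));
        } catch (IOException e) {
            System.out.println("failed fast: " + e.getMessage());
        }
    }
}
```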
[jira] [Commented] (HDFS-12872) EC Checksum broken when BlockAccessToken is enabled
[ https://issues.apache.org/jira/browse/HDFS-12872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272173#comment-16272173 ] genericqa commented on HDFS-12872: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 39s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 35s{color} | {color:orange} hadoop-hdfs-project: The patch generated 2 new + 3 unchanged - 0 fixed = 5 total (was 3) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 9s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 17s{color} | {color:green} hadoop-hdfs-client in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}153m 16s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}206m 33s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.tools.TestWebHDFSStoragePolicyCommands | | | hadoop.hdfs.TestDFSRollback | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 | | | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForContentSummary | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 | | | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.hdfs.TestEncryptionZones | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | hadoop.hdfs.tools.TestViewFSStoragePolicyCommands | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | |
[jira] [Updated] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
[ https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12665: -- Status: Patch Available (was: Open) > [AliasMap] Create a version of the AliasMap that runs in memory in the > Namenode (leveldb) > - > > Key: HDFS-12665 > URL: https://issues.apache.org/jira/browse/HDFS-12665 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ewan Higgs >Assignee: Ewan Higgs > Attachments: HDFS-12665-HDFS-9806.001.patch, > HDFS-12665-HDFS-9806.002.patch, HDFS-12665-HDFS-9806.003.patch, > HDFS-12665-HDFS-9806.004.patch, > HDFS-12665-HDFS-9806.005.patch, > HDFS-12665-HDFS-9806.006.patch, HDFS-12665-HDFS-9806.007.patch, > HDFS-12665-HDFS-9806.008.patch, HDFS-12665-HDFS-9806.009.patch, > HDFS-12665-HDFS-9806.010.patch, HDFS-12665-HDFS-9806.011.patch, > HDFS-12665-HDFS-9806.012.patch > > > The design of Provided Storage requires the use of an AliasMap to manage the > mapping between blocks of files on the local HDFS and ranges of files on a > remote storage system. To reduce load on the Namenode, this can be done > using a pluggable external service (e.g. AzureTable, Cassandra, Ratis). > However, to aid adoption and ease of deployment, we propose an in-memory > version. > This AliasMap will be a wrapper around LevelDB (already a dependency from the > Timeline Service) and use protobuf for the key (blockpool, blockid, and > genstamp) and the value (url, offset, length, nonce). The in-memory service > will also have a configurable port on which it will listen for updates from > Storage Policy Satisfier (SPS) Coordinating Datanodes (C-DN).
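The key/value layout in the proposal above can be sketched without LevelDB or protobuf. The sketch below is illustrative only: a TreeMap stands in for LevelDB's sorted key space, plain records stand in for the protobuf messages, and none of these class or method names come from the actual patch.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative stand-in for the LevelDB-backed AliasMap: the key identifies
// an HDFS block, the value locates its bytes in the remote storage system.
class AliasMapSketch {
    // Protobuf key in the proposal: (blockpool, blockid, genstamp).
    record BlockKey(String blockPool, long blockId, long genStamp)
            implements Comparable<BlockKey> {
        public int compareTo(BlockKey o) {
            int c = blockPool.compareTo(o.blockPool);
            if (c != 0) return c;
            c = Long.compare(blockId, o.blockId);
            return c != 0 ? c : Long.compare(genStamp, o.genStamp);
        }
    }

    // Protobuf value in the proposal: (url, offset, length, nonce).
    record ProvidedLocation(String url, long offset, long length, long nonce) {}

    // TreeMap models LevelDB's sorted store; real code would encode both
    // sides with protobuf and write them through the LevelDB JNI bindings.
    private final Map<BlockKey, ProvidedLocation> store = new TreeMap<>();

    void write(BlockKey key, ProvidedLocation loc) { store.put(key, loc); }
    ProvidedLocation read(BlockKey key) { return store.get(key); }
}
```

The sorted key space matters: ordering by (blockpool, blockid, genstamp) keeps all entries for one block pool contiguous, which is what makes range scans over a pool cheap in LevelDB.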
[jira] [Updated] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
[ https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12665: -- Status: Open (was: Patch Available) > [AliasMap] Create a version of the AliasMap that runs in memory in the > Namenode (leveldb) > - > > Key: HDFS-12665 > URL: https://issues.apache.org/jira/browse/HDFS-12665 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ewan Higgs >Assignee: Ewan Higgs > Attachments: HDFS-12665-HDFS-9806.001.patch, > HDFS-12665-HDFS-9806.002.patch, > HDFS-12665-HDFS-9806.003.patch, > HDFS-12665-HDFS-9806.004.patch, > HDFS-12665-HDFS-9806.005.patch, > HDFS-12665-HDFS-9806.006.patch, HDFS-12665-HDFS-9806.007.patch, > HDFS-12665-HDFS-9806.008.patch, HDFS-12665-HDFS-9806.009.patch, > HDFS-12665-HDFS-9806.010.patch, HDFS-12665-HDFS-9806.011.patch, > HDFS-12665-HDFS-9806.012.patch > > > The design of Provided Storage requires the use of an AliasMap to manage the > mapping between blocks of files on the local HDFS and ranges of files on a > remote storage system. To reduce load on the Namenode, this can be done > using a pluggable external service (e.g. AzureTable, Cassandra, Ratis). > However, to aid adoption and ease of deployment, we propose an in-memory > version. > This AliasMap will be a wrapper around LevelDB (already a dependency from the > Timeline Service) and use protobuf for the key (blockpool, blockid, and > genstamp) and the value (url, offset, length, nonce). The in-memory service > will also have a configurable port on which it will listen for updates from > Storage Policy Satisfier (SPS) Coordinating Datanodes (C-DN).
[jira] [Updated] (HDFS-12685) [READ] FsVolumeImpl exception when scanning Provided storage volume
[ https://issues.apache.org/jira/browse/HDFS-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12685: -- Status: Patch Available (was: Open) > [READ] FsVolumeImpl exception when scanning Provided storage volume > --- > > Key: HDFS-12685 > URL: https://issues.apache.org/jira/browse/HDFS-12685 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ewan Higgs >Assignee: Virajith Jalaparti > Attachments: HDFS-12685-HDFS-9806.001.patch, > HDFS-12685-HDFS-9806.002.patch, HDFS-12685-HDFS-9806.003.patch, > HDFS-12685-HDFS-9806.004.patch > > > I left a Datanode running overnight and found this in the logs in the morning:
> {code}
> 2017-10-18 23:51:54,391 ERROR datanode.DirectoryScanner: Error compiling report for the volume, StorageId: DS-e75ebc3c-6b12-424e-875a-a4ae1a4dcc29
> java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: URI scheme is not "file"
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:544)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:393)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: URI scheme is not "file"
>     at java.io.File.<init>(File.java:421)
[jira] [Updated] (HDFS-12685) [READ] FsVolumeImpl exception when scanning Provided storage volume
[ https://issues.apache.org/jira/browse/HDFS-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12685: -- Attachment: HDFS-12685-HDFS-9806.004.patch Patch v4 fixes the checkstyle errors. > [READ] FsVolumeImpl exception when scanning Provided storage volume > --- > > Key: HDFS-12685 > URL: https://issues.apache.org/jira/browse/HDFS-12685 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ewan Higgs >Assignee: Virajith Jalaparti > Attachments: HDFS-12685-HDFS-9806.001.patch, > HDFS-12685-HDFS-9806.002.patch, HDFS-12685-HDFS-9806.003.patch, > HDFS-12685-HDFS-9806.004.patch > > > I left a Datanode running overnight and found this in the logs in the morning:
> {code}
> 2017-10-18 23:51:54,391 ERROR datanode.DirectoryScanner: Error compiling report for the volume, StorageId: DS-e75ebc3c-6b12-424e-875a-a4ae1a4dcc29
> java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: URI scheme is not "file"
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:544)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:393)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: URI scheme is not "file"
>     at java.io.File.<init>(File.java:421)
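The root cause in the trace above is plain JDK behavior, reproducible outside Hadoop: java.io.File's URI constructor only accepts the "file" scheme, which is exactly what fails when the DirectoryScanner hands it a URI from a Provided (remote) volume. A minimal, Hadoop-free reproduction (the class and method names here are invented for illustration):

```java
import java.io.File;
import java.net.URI;

// Reproduces the exception in the stack trace: File(URI) rejects any
// non-"file" scheme with IllegalArgumentException at construction time.
class FileSchemeDemo {
    // Returns null if the URI is accepted by java.io.File,
    // otherwise the exception message.
    static String failureFor(String uri) {
        try {
            new File(URI.create(uri));
            return null; // local file:// URIs construct fine
        } catch (IllegalArgumentException e) {
            return e.getMessage(); // e.g. URI scheme is not "file"
        }
    }
}
```

This is why scanning code that assumes local java.io.File paths has to special-case Provided volumes rather than feeding their remote URIs through the same path.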
[jira] [Commented] (HDFS-12681) Make HdfsLocatedFileStatus a subtype of LocatedFileStatus
[ https://issues.apache.org/jira/browse/HDFS-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272165#comment-16272165 ] Hudson commented on HDFS-12681: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13295 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13295/]) HDFS-12681. Make HdfsLocatedFileStatus a subtype of LocatedFileStatus (cdouglas: rev 0e560f3b8d194c10dce06443979df4074e14b0db) * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/protocolPB/PBHelper.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatusSerialization.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsFileStatus.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsLocatedFileStatus.java * (edit) hadoop-hdfs-project/hadoop-hdfs/dev-support/findbugsExcludeFile.xml * (edit) hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java * (add) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsNamedFileStatus.java * (add) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/package-info.java * (add) hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/protocol/TestHdfsFileStatusMethods.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestStorageMover.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/dev-support/findbugsExcludeFile.xml * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/LocatedFileStatus.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockStoragePolicy.java > Make HdfsLocatedFileStatus a subtype of LocatedFileStatus > - > > Key: HDFS-12681 > URL: https://issues.apache.org/jira/browse/HDFS-12681 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chris Douglas >Assignee: Chris Douglas > Fix For: 3.1.0 > > Attachments: HDFS-12681.00.patch, HDFS-12681.01.patch, > HDFS-12681.02.patch, HDFS-12681.03.patch, HDFS-12681.04.patch, > HDFS-12681.05.patch, HDFS-12681.06.patch, HDFS-12681.07.patch, > HDFS-12681.08.patch, HDFS-12681.09.patch, HDFS-12681.10.patch, > HDFS-12681.11.patch, HDFS-12681.12.patch, HDFS-12681.13.patch, > HDFS-12681.14.patch, HDFS-12681.15.patch, HDFS-12681.16.patch > > > {{HdfsLocatedFileStatus}} is a subtype of {{HdfsFileStatus}}, but not of > {{LocatedFileStatus}}. Conversion requires copying common fields and shedding > unknown data. It would be cleaner and sufficient for {{HdfsFileStatus}} to > extend {{LocatedFileStatus}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
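The shape of the change can be sketched with toy classes. These are simplified stand-ins with invented fields, not the actual Hadoop types: the point is only that once the HDFS located status sits under LocatedFileStatus, callers expecting a LocatedFileStatus can take it directly, with no copy-conversion that sheds the HDFS-specific data.

```java
// Toy hierarchy mirroring the post-patch arrangement (fields invented).
class FileStatus {
    long len;
}

class LocatedFileStatus extends FileStatus {
    String[] blockLocations = new String[0];
}

class HdfsLocatedFileStatus extends LocatedFileStatus {
    byte storagePolicy = 7; // HDFS-only detail a copy-conversion would shed
}

class StatusDemo {
    // Generic client code programs against LocatedFileStatus; an HDFS
    // status is now simply substitutable, no conversion method needed.
    static int locationCount(LocatedFileStatus s) {
        return s.blockLocations.length;
    }
}
```

Before the patch the equivalent call site needed an explicit conversion step copying the common fields into a fresh LocatedFileStatus, dropping everything HDFS-specific on the floor.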
[jira] [Updated] (HDFS-12685) [READ] FsVolumeImpl exception when scanning Provided storage volume
[ https://issues.apache.org/jira/browse/HDFS-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12685: -- Status: Open (was: Patch Available) > [READ] FsVolumeImpl exception when scanning Provided storage volume > --- > > Key: HDFS-12685 > URL: https://issues.apache.org/jira/browse/HDFS-12685 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ewan Higgs >Assignee: Virajith Jalaparti > Attachments: HDFS-12685-HDFS-9806.001.patch, > HDFS-12685-HDFS-9806.002.patch, HDFS-12685-HDFS-9806.003.patch > > > I left a Datanode running overnight and found this in the logs in the morning:
> {code}
> 2017-10-18 23:51:54,391 ERROR datanode.DirectoryScanner: Error compiling report for the volume, StorageId: DS-e75ebc3c-6b12-424e-875a-a4ae1a4dcc29
> java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: URI scheme is not "file"
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:544)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:393)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
>     at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: URI scheme is not "file"
>     at java.io.File.<init>(File.java:421)
[jira] [Commented] (HDFS-12681) Make HdfsLocatedFileStatus a subtype of LocatedFileStatus
[ https://issues.apache.org/jira/browse/HDFS-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272148#comment-16272148 ] Chris Douglas commented on HDFS-12681: -- Failed tests are due to resource exhaustion. {{TestUnbuffer}} failed in my environment (HADOOP-15056), but the other tests passed last time I ran them. > Make HdfsLocatedFileStatus a subtype of LocatedFileStatus > - > > Key: HDFS-12681 > URL: https://issues.apache.org/jira/browse/HDFS-12681 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chris Douglas >Assignee: Chris Douglas > Attachments: HDFS-12681.00.patch, HDFS-12681.01.patch, > HDFS-12681.02.patch, HDFS-12681.03.patch, HDFS-12681.04.patch, > HDFS-12681.05.patch, HDFS-12681.06.patch, HDFS-12681.07.patch, > HDFS-12681.08.patch, HDFS-12681.09.patch, HDFS-12681.10.patch, > HDFS-12681.11.patch, HDFS-12681.12.patch, HDFS-12681.13.patch, > HDFS-12681.14.patch, HDFS-12681.15.patch, HDFS-12681.16.patch > > > {{HdfsLocatedFileStatus}} is a subtype of {{HdfsFileStatus}}, but not of > {{LocatedFileStatus}}. Conversion requires copying common fields and shedding > unknown data. It would be cleaner and sufficient for {{HdfsFileStatus}} to > extend {{LocatedFileStatus}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
[ https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272137#comment-16272137 ] genericqa commented on HDFS-12665: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} HDFS-9806 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 5m 30s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 39s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 16s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 55s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 17s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 25s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-9806 has 1 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 26s{color} | {color:red} hadoop-tools/hadoop-fs2img in HDFS-9806 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} HDFS-9806 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 12m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 39s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 55s{color} | {color:orange} root: The patch generated 8 new + 627 unchanged - 0 fixed = 635 total (was 627) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 6s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 8m 28s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s{color} | {color:green} hadoop-project in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 19s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 32s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m
[jira] [Commented] (HDFS-12681) Make HdfsLocatedFileStatus a subtype of LocatedFileStatus
[ https://issues.apache.org/jira/browse/HDFS-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272124#comment-16272124 ] genericqa commented on HDFS-12681: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 7s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 45s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 50s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 31s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 23s{color} | {color:orange} root: The patch generated 12 new + 410 unchanged - 6 fixed = 422 total (was 416) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 1s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 9m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 4s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 14s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 14s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}125m 53s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 48s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 32s{color} | {color:red} The patch generated 3 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}275m 48s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.TestShellBasedUnixGroupsMapping | | | hadoop.hdfs.TestEncryptedTransfer | | | hadoop.hdfs.TestSnapshotCommands | | | hadoop.hdfs.TestBlocksScheduledCounter | | | hadoop.hdfs.TestDFSClientFailover | | | hadoop.hdfs.TestDatanodeDeath | | | hadoop.hdfs.TestDFSRollback | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 | | |
[jira] [Commented] (HDFS-12685) [READ] FsVolumeImpl exception when scanning Provided storage volume
[ https://issues.apache.org/jira/browse/HDFS-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272119#comment-16272119 ] genericqa commented on HDFS-12685: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 40s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} HDFS-9806 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 35s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s{color} | {color:green} HDFS-9806 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} HDFS-9806 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 37s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 83 unchanged - 2 fixed = 86 total (was 85) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}120m 49s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}198m 19s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Unreaped Processes | hadoop-hdfs:3 | | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics | | | hadoop.hdfs.TestErasureCodingPoliciesWithRandomECPolicy | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 | | | hadoop.hdfs.TestLeaseRecovery2 | | | hadoop.hdfs.TestEncryptionZonesWithKMS | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 | | | hadoop.hdfs.TestErasureCodingMultipleRacks | | | hadoop.hdfs.TestFileChecksum | | | hadoop.hdfs.qjournal.client.TestQJMWithFaults | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12685 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12899911/HDFS-12685-HDFS-9806.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 94d8a8e766d0 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-9806 / 6e805c0 | | maven | version:
[jira] [Commented] (HDFS-12862) When modify cacheDirective ,editLog may serial relative expiryTime
[ https://issues.apache.org/jira/browse/HDFS-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272085#comment-16272085 ] Wang XL commented on HDFS-12862: I will post a patch later. > When modify cacheDirective ,editLog may serial relative expiryTime > -- > > Key: HDFS-12862 > URL: https://issues.apache.org/jira/browse/HDFS-12862 > Project: Hadoop HDFS > Issue Type: Bug > Components: caching, hdfs >Affects Versions: 2.7.1 > Environment: >Reporter: Wang XL > > The logic in FSNDNCacheOp#modifyCacheDirective is not correct. When modifying a cacheDirective, the expiration in the directive may be a relative expiryTime, and the EditLog will serialize that relative expiry time.
> {code:java}
> // Some comments here
> static void modifyCacheDirective(
>     FSNamesystem fsn, CacheManager cacheManager, CacheDirectiveInfo directive,
>     EnumSet<CacheFlag> flags, boolean logRetryCache) throws IOException {
>   final FSPermissionChecker pc = getFsPermissionChecker(fsn);
>   cacheManager.modifyDirective(directive, pc, flags);
>   fsn.getEditLog().logModifyCacheDirectiveInfo(directive, logRetryCache);
> }
> {code}
> But when the SBN replays the log, it invokes FSImageSerialization#readCacheDirectiveInfo, which reads the value as an absolute expiryTime. This results in an inconsistency.
> {code:java}
> public static CacheDirectiveInfo readCacheDirectiveInfo(DataInput in)
>     throws IOException {
>   CacheDirectiveInfo.Builder builder =
>       new CacheDirectiveInfo.Builder();
>   builder.setId(readLong(in));
>   int flags = in.readInt();
>   if ((flags & 0x1) != 0) {
>     builder.setPath(new Path(readString(in)));
>   }
>   if ((flags & 0x2) != 0) {
>     builder.setReplication(readShort(in));
>   }
>   if ((flags & 0x4) != 0) {
>     builder.setPool(readString(in));
>   }
>   if ((flags & 0x8) != 0) {
>     builder.setExpiration(
>         CacheDirectiveInfo.Expiration.newAbsolute(readLong(in)));
>   }
>   if ((flags & ~0xF) != 0) {
>     throw new IOException("unknown flags set in " +
>         "ModifyCacheDirectiveInfoOp: " + flags);
>   }
>   return builder.build();
> }
> {code}
> In other words, fsn.getEditLog().logModifyCacheDirectiveInfo(directive, logRetryCache) may serialize a relative expiry time, but builder.setExpiration(CacheDirectiveInfo.Expiration.newAbsolute(readLong(in))) reads it back as an absolute expiryTime. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
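The mismatch described above can be made concrete with a small model. This is an illustrative sketch only, not the real HDFS classes: `toAbsoluteMs` stands in for the normalization the fix needs to apply before the edit log serializes the expiry, so that the reader's absolute interpretation stays consistent.

```java
// Illustrative model of the relative-vs-absolute expiry bug, not HDFS code.
// An expiry that is relative must be converted to an absolute timestamp
// before serialization, because the deserializer always calls
// Expiration.newAbsolute() on the stored value.
public class ExpiryDemo {
    // Models CacheDirectiveInfo.Expiration as (millis, isRelative).
    static long toAbsoluteMs(long expiryMs, boolean isRelative, long nowMs) {
        // The fix this issue implies: normalize to absolute before logging.
        return isRelative ? nowMs + expiryMs : expiryMs;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        // A relative expiry of 5000 ms becomes now + 5000 when logged.
        System.out.println(toAbsoluteMs(5_000L, true, now));      // 1005000
        // An absolute expiry passes through unchanged.
        System.out.println(toAbsoluteMs(2_000_000L, false, now)); // 2000000
    }
}
```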
[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets
[ https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272083#comment-16272083 ] Erik Krogen commented on HDFS-12638: I investigated this further and think that Konstantin's v002 patch should actually solve the problem. Actually, HDFS-9754 does not change the invariant that [~shv] mentioned. A few notes from the investigation:
* After incremental block deletion was added, it was already true that a block not associated with a valid INode could be present in the blocksMap. In {{FSNamesystem#delete()}}, we first call {{FSDirDeleteOp#delete()}} within the write lock. We then release the write lock and call {{BlockManager#removeBlock()}} (which removes the block from the blocksMap) on each block later on. Within {{FSDirDeleteOp#delete()}}, all INodes being deleted are removed from the inodesMap (see {{FSDirDeleteOp#deleteInternal()}}, which calls {{FSNamesystem#removeLeasesAndINodes()}}).
* This scenario meant that places such as {{BlockManager#scheduleReconstruction()}} had to check whether there was a BlockCollection associated with the block, which they previously did by checking {{FSNamesystem#getBlockCollection(blkInfo.getBlockCollectionId()) != null}}. Now, HDFS-9754 replaced this call with {{BlockInfo#isDeleted()}}, which means that whenever we remove an INode from the inodesMap, we need to call {{BlockInfo#delete()}} to indicate that the block no longer has a valid BlockCollection associated with it (this is currently done within {{INodeFile#clearFile()}}, called by {{INode#destroyAndCollectBlocks()}}, called by {{FSDirDeleteOp#unprotectedDelete()}}).
* When HDFS-9754 was added, it did not properly mark copy-on-truncate blocks with {{BlockInfo#delete()}}, so the {{BlockInfo#isDeleted()}} check would fail, causing {{BlockManager#scheduleReconstruction()}} to throw an NPE when it tries to use {{FSNamesystem#getBlockCollection(blkInfo)}} (since it assumes there is a valid block collection associated).
* Konstantin's patch correctly invalidates copy-on-truncate blocks, so it should fix this NPE, at least for the case of copy-on-truncate blocks. So +1 from me (non-binding) on the logic of the v002 patch. We should also try to get a unit test in for this. > NameNode exits due to ReplicationMonitor thread received Runtime exception in > ReplicationWork#chooseTargets > --- > > Key: HDFS-12638 > URL: https://issues.apache.org/jira/browse/HDFS-12638 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.8.2 >Reporter: Jiandan Yang >Priority: Blocker > Attachments: HDFS-12638-branch-2.8.2.001.patch, HDFS-12638.002.patch, > OphanBlocksAfterTruncateDelete.jpg > > > The active NameNode exited due to an NPE. I can confirm that the BlockCollection passed > in when creating ReplicationWork is null, but I do not know why. Reviewing the history, I found that > [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check for > whether the BlockCollection is null. > NN logs are as follows: > {code:java} > 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: > ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744) > at java.lang.Thread.run(Thread.java:834) > {code}
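The invariant discussed in the comment above — once the null-BlockCollection check is replaced by {{isDeleted()}}, every deletion path must mark the block via {{BlockInfo#delete()}} — can be sketched with a minimal model. The classes below are illustrative stand-ins, not the actual BlockManager/BlockInfo code:

```java
// Minimal model (not real HDFS code) of why a missed BlockInfo#delete()
// call leads to an NPE: the scheduler's guard trusts the deleted flag,
// so a block whose INode is gone but whose flag was never set slips
// through and the collection lookup dereferences nothing.
public class BlockInfoDemo {
    static class BlockInfo {
        Long blockCollectionId;   // null models "no valid BlockCollection"
        boolean deleted;

        void delete() {           // models BlockInfo#delete()
            deleted = true;
            blockCollectionId = null;
        }
        boolean isDeleted() { return deleted; }
    }

    // Models the guard in a scheduleReconstruction-style method.
    static boolean canSchedule(BlockInfo b) {
        if (b.isDeleted()) {
            return false;         // skip: no valid BlockCollection
        }
        // Only safe to look up the collection when the flag is accurate.
        return b.blockCollectionId != null;
    }

    public static void main(String[] args) {
        BlockInfo b = new BlockInfo();
        b.blockCollectionId = 42L;
        System.out.println(canSchedule(b)); // true
        b.delete(); // the step the copy-on-truncate path was missing
        System.out.println(canSchedule(b)); // false
    }
}
```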
[jira] [Commented] (HDFS-11576) Block recovery will fail indefinitely if recovery time > heartbeat interval
[ https://issues.apache.org/jira/browse/HDFS-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272041#comment-16272041 ] genericqa commented on HDFS-11576: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 30s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 57s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 18s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 15s{color} | {color:orange} root: The patch generated 1 new + 375 unchanged - 0 fixed = 376 total (was 375) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 39s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 48s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}128m 14s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 37s{color} | {color:red} The patch generated 4 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}262m 24s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs | | | Inconsistent synchronization of org.apache.hadoop.hdfs.server.blockmanagement.PendingRecoveryBlocks.recoveryTimeoutInterval; locked 66% of time Unsynchronized access at PendingRecoveryBlocks.java:66% of time Unsynchronized access at PendingRecoveryBlocks.java:[line 103] | | Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 | | | hadoop.hdfs.TestBlockStoragePolicy | | | hadoop.hdfs.TestDFSUpgradeFromImage | | | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy | | | hadoop.hdfs.TestErasureCodingPoliciesWithRandomECPolicy | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.hdfs.crypto.TestHdfsCryptoStreams | | |
[jira] [Comment Edited] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
[ https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271877#comment-16271877 ] Virajith Jalaparti edited comment on HDFS-12665 at 11/30/17 1:50 AM: - Updated patch which is rebased on HDFS-9806 feature branch. was (Author: virajith): Updated patch which is rebased on 9806 feature branch. > [AliasMap] Create a version of the AliasMap that runs in memory in the > Namenode (leveldb) > - > > Key: HDFS-12665 > URL: https://issues.apache.org/jira/browse/HDFS-12665 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ewan Higgs >Assignee: Ewan Higgs > Attachments: HDFS-12665-HDFS-9806.001.patch, > HDFS-12665-HDFS-9806.002.patch, HDFS-12665-HDFS-9806.003.patch, > HDFS-12665-HDFS-9806.004.patch, HDFS-12665-HDFS-9806.005.patch, > HDFS-12665-HDFS-9806.006.patch, HDFS-12665-HDFS-9806.007.patch, > HDFS-12665-HDFS-9806.008.patch, HDFS-12665-HDFS-9806.009.patch, > HDFS-12665-HDFS-9806.010.patch, HDFS-12665-HDFS-9806.011.patch, > HDFS-12665-HDFS-9806.012.patch > > > The design of Provided Storage requires the use of an AliasMap to manage the > mapping between blocks of files on the local HDFS and ranges of files on a > remote storage system. To reduce load from the Namenode, this can be done > using a pluggable external service (e.g. AzureTable, Cassandra, Ratis). > However, to aid adoption and ease of deployment, we propose an in-memory > version. > This AliasMap will be a wrapper around LevelDB (already a dependency from the > Timeline Service) and use protobuf for the key (blockpool, blockid, and > genstamp) and the value (url, offset, length, nonce). The in-memory service > will also have a configurable port on which it will listen for updates from > Storage Policy Satisfier (SPS) Coordinating Datanodes (C-DN).
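The key/value scheme described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions: a plain string key and an in-process HashMap stand in for the protobuf key and LevelDB store; none of these names come from the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the AliasMap idea: key = (blockpool, blockId, genstamp),
// value = (url, offset, length). The real design uses protobuf-encoded
// keys and a LevelDB backing store; a HashMap stands in here.
public class AliasMapSketch {
    // Models the value: where the block's bytes live on remote storage.
    record ProvidedLocation(String url, long offset, long length) {}

    private final Map<String, ProvidedLocation> store = new HashMap<>();

    // Flatten the composite key into a single lookup string.
    private static String key(String blockPool, long blockId, long genStamp) {
        return blockPool + "/" + blockId + "/" + genStamp;
    }

    public void put(String bp, long id, long gs, ProvidedLocation loc) {
        store.put(key(bp, id, gs), loc);
    }

    public ProvidedLocation get(String bp, long id, long gs) {
        return store.get(key(bp, id, gs)); // null when unmapped
    }
}
```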
[jira] [Commented] (HDFS-12051) Intern INOdeFileAttributes$SnapshotCopy.name byte[] arrays to save memory
[ https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271991#comment-16271991 ] genericqa commented on HDFS-12051: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 19s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 35s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 7 new + 634 unchanged - 17 fixed = 641 total (was 651) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}122m 50s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}168m 18s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestFileAppend2 | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 | | | hadoop.fs.TestUnbuffer | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.tools.TestHdfsConfigFields | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.TestDatanodeDeath | | | hadoop.hdfs.server.balancer.TestBalancerRPCDelay | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 | | | hadoop.hdfs.TestBlocksScheduledCounter | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12051 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12876918/HDFS-12051.02.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ce095bc4a720 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 333ef30 | | maven | version: Apache Maven 3.3.9 | |
[jira] [Updated] (HDFS-12873) Creating a '..' directory is possible using inode paths
[ https://issues.apache.org/jira/browse/HDFS-12873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raeanne J Marks updated HDFS-12873: --- Description: Start with a fresh deployment of HDFS. 1. Mkdirs '/x/y/z' 2. use GetFileInfo to get y's inode number 3. Mkdirs '/.reserved/.inodes//z/../foo' Expectation: The path in step 3 is rejected as invalid (exception thrown) OR foo would be created under y. Observation: This created a directory called '..' under z and 'foo' under that '..' directory instead of consolidating the path to '/x/y/foo' or throwing an exception. GetListing on '/.reserved/.inodes/ ' shows '..', while GetListing on '/x/y' does not. Mkdirs INotify events were reported with the following paths, in order: /x /x/y /x/y/z /x/y/z/.. /x/y/z/../foo I can also chain these dotdot directories and make them as deep as I want. Mkdirs works with the following paths appended to the inode path for directory y: '/z/../../../foo', '/z/../../../../../', '/z/../../../foo/bar/../..' etc, and it constructs all the '..' directories as if they weren't special names. was: Start with a fresh deployment of HDFS. 1. `Mkdirs '/x/y/z'` 2. use `GetFileInfo` to get y's inode number 3. `Mkdirs '/.reserved/.inodes/ /z/../foo'` Expectation: The path in step 3 is rejected as invalid (exception thrown) OR `foo` would be created under `y`. Observation: This created a directory called `..` under `z` and `foo` under that `..` directory instead of consolidating the path to `/x/y/foo` or throwing an exception. `GetListing` on `/.reserved/.inodes/ ` shows `..`, while `GetListing` on `/x/y` does not. `Mkdirs` INotify events were reported with the following paths, in order: ``` /x /x/y /x/y/z /x/y/z/.. /x/y/z/../foo ``` I can also chain these dotdot directories and make them as deep as I want. 
`Mkdirs` works with the following paths appended to the inode path for directory `y`: `/z/../../../foo`, `/z/../../../../../`, `/z/../../../foo/bar/../..` etc, and it constructs all the `..` directories as if they weren't special names. > Creating a '..' directory is possible using inode paths > --- > > Key: HDFS-12873 > URL: https://issues.apache.org/jira/browse/HDFS-12873 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, namenode >Affects Versions: 2.8.0 > Environment: Apache NameNode running in a Docker container on a > Fedora 25 workstation. >Reporter: Raeanne J Marks > > Start with a fresh deployment of HDFS. > 1. Mkdirs '/x/y/z' > 2. use GetFileInfo to get y's inode number > 3. Mkdirs '/.reserved/.inodes/ /z/../foo' > Expectation: The path in step 3 is rejected as invalid (exception thrown) OR > foo would be created under y. > Observation: This created a directory called '..' under z and 'foo' under > that '..' directory instead of consolidating the path to '/x/y/foo' or > throwing an exception. GetListing on '/.reserved/.inodes/ ' > shows '..', while GetListing on '/x/y' does not. > Mkdirs INotify events were reported with the following paths, in order: > /x > /x/y > /x/y/z > /x/y/z/.. > /x/y/z/../foo > I can also chain these dotdot directories and make them as deep as I want. > Mkdirs works with the following paths appended to the inode path for > directory y: '/z/../../../foo', '/z/../../../../../', > '/z/../../../foo/bar/../..' etc, and it constructs all the '..' directories > as if they weren't special names.
[jira] [Updated] (HDFS-12873) Creating a '..' directory is possible using inode paths
[ https://issues.apache.org/jira/browse/HDFS-12873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raeanne J Marks updated HDFS-12873: --- Description: Start with a fresh deployment of HDFS. 1. `Mkdirs '/x/y/z'` 2. use `GetFileInfo` to get y's inode number 3. `Mkdirs '/.reserved/.inodes//z/../foo'` Expectation: The path in step 3 is rejected as invalid (exception thrown) OR `foo` would be created under `y`. Observation: This created a directory called `..` under `z` and `foo` under that `..` directory instead of consolidating the path to `/x/y/foo` or throwing an exception. `GetListing` on `/.reserved/.inodes/ ` shows `..`, while `GetListing` on `/x/y` does not. `Mkdirs` INotify events were reported with the following paths, in order: ``` /x /x/y /x/y/z /x/y/z/.. /x/y/z/../foo ``` I can also chain these dotdot directories and make them as deep as I want. `Mkdirs` works with the following paths appended to the inode path for directory `y`: `/z/../../../foo`, `/z/../../../../../`, `/z/../../../foo/bar/../..` etc, and it constructs all the `..` directories as if they weren't special names. was: Start with a fresh deployment of HDFS. 1. Mkdirs '/x/y/z' 2. use GetFileInfo to get y's inode number 3. Mkdirs '/.reserved/.inodes/ /z/../foo' Expectation: The path in step 3 is rejected as invalid (exception thrown) OR foo would be created under y. Observation: This created a directory called '..' under z and 'foo' under that '..' directory instead of consolidating the path to '/x/y/foo' or throwing an exception. GetListing on '/.reserved/.inodes/ ' shows '..', while GetListing on '/x/y' does not. Mkdirs INotify events were reported with the following paths, in order: /x /x/y /x/y/z /x/y/z/.. /x/y/z/../foo I can also chain these dotdot directories and make them as deep as I want. Mkdirs works with the following paths appended to the inode path for directory y: '/z/../../../foo', '/z/../../../../../', '/z/../../../foo/bar/../..' 
etc, and it constructs all the '..' directories as if they weren't special names. > Creating a '..' directory is possible using inode paths > --- > > Key: HDFS-12873 > URL: https://issues.apache.org/jira/browse/HDFS-12873 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, namenode >Affects Versions: 2.8.0 > Environment: Apache NameNode running in a Docker container on a > Fedora 25 workstation. >Reporter: Raeanne J Marks > > Start with a fresh deployment of HDFS. > 1. `Mkdirs '/x/y/z'` > 2. use `GetFileInfo` to get y's inode number > 3. `Mkdirs '/.reserved/.inodes/ /z/../foo'` > Expectation: The path in step 3 is rejected as invalid (exception thrown) OR > `foo` would be created under `y`. > Observation: This created a directory called `..` under `z` and `foo` under > that `..` directory instead of consolidating the path to `/x/y/foo` or > throwing an exception. `GetListing` on `/.reserved/.inodes/<inode number>` shows `..`, while `GetListing` on `/x/y` does not. > `Mkdirs` INotify events were reported with the following paths, in order: > ``` > /x > /x/y > /x/y/z > /x/y/z/.. > /x/y/z/../foo > ``` > I can also chain these dotdot directories and make them as deep as I want. > `Mkdirs` works with the following paths appended to the inode path for > directory `y`: `/z/../../../foo`, `/z/../../../../../`, > `/z/../../../foo/bar/../..` etc, and it constructs all the `..` directories > as if they weren't special names.
[jira] [Created] (HDFS-12873) Creating a '..' directory is possible using inode paths
Raeanne J Marks created HDFS-12873: -- Summary: Creating a '..' directory is possible using inode paths Key: HDFS-12873 URL: https://issues.apache.org/jira/browse/HDFS-12873 Project: Hadoop HDFS Issue Type: Bug Components: hdfs, namenode Affects Versions: 2.8.0 Environment: Apache NameNode running in a Docker container on a Fedora 25 workstation. Reporter: Raeanne J Marks Start with a fresh deployment of HDFS. 1. Mkdirs '/x/y/z' 2. use GetFileInfo to get y's inode number 3. Mkdirs '/.reserved/.inodes//z/../foo' Expectation: The path in step 3 is rejected as invalid (exception thrown) OR foo would be created under y. Observation: This created a directory called '..' under z and 'foo' under that '..' directory instead of consolidating the path to '/x/y/foo' or throwing an exception. GetListing on '/.reserved/.inodes/ ' shows '..', while GetListing on '/x/y' does not. Mkdirs INotify events were reported with the following paths, in order: /x /x/y /x/y/z /x/y/z/.. /x/y/z/../foo I can also chain these dotdot directories and make them as deep as I want. Mkdirs works with the following paths appended to the inode path for directory y: '/z/../../../foo', '/z/../../../../../', '/z/../../../foo/bar/../..' etc, and it constructs all the '..' directories as if they weren't special names.
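The behavior the reporter expected amounts to a path-component validation step. A hedged sketch of that check follows — illustrative code, not the NameNode's actual path resolver — rejecting literal '.' and '..' components instead of creating them as directories:

```java
// Illustrative validation: a path resolved through an inode path should
// never produce a literal "." or ".." directory entry. Component names
// matching those strings are reserved by POSIX-style path semantics.
public class PathCheck {
    static boolean isValidComponent(String name) {
        return !name.isEmpty() && !name.equals(".") && !name.equals("..");
    }

    // Throws on any reserved component; empty components ("//") are skipped.
    static void validate(String path) {
        for (String c : path.split("/")) {
            if (!c.isEmpty() && !isValidComponent(c)) {
                throw new IllegalArgumentException("invalid component: " + c);
            }
        }
    }
}
```

Under this check, step 3's path would be rejected up front rather than materializing a `..` directory under `z`.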
[jira] [Commented] (HDFS-12836) startTxId could be greater than endTxId when tailing in-progress edit log
[ https://issues.apache.org/jira/browse/HDFS-12836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271923#comment-16271923 ] Chao Sun commented on HDFS-12836: - [~jojochuang] The checkstyle passed but there are some failures again in the latest jenkins run. They do not look related: most failed with {{java.lang.OutOfMemoryError: unable to create new native thread}}. Can you double-check and let me know if the +1 still holds? Thanks! > startTxId could be greater than endTxId when tailing in-progress edit log > - > > Key: HDFS-12836 > URL: https://issues.apache.org/jira/browse/HDFS-12836 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.0-alpha1 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HDFS-12836.1.patch, HDFS-12836.2.patch, > HDFS-12836.3.patch > > > When {{dfs.ha.tail-edits.in-progress}} is true, the edit log tailer will also > tail in-progress edit log segments. However, in the following code:
> {code}
> if (onlyDurableTxns && inProgressOk) {
>   endTxId = Math.min(endTxId, committedTxnId);
> }
> EditLogInputStream elis = EditLogFileInputStream.fromUrl(
>     connectionFactory, url, remoteLog.getStartTxId(),
>     endTxId, remoteLog.isInProgress());
> {code}
> it is possible that {{remoteLog.getStartTxId()}} is greater than > {{endTxId}}, which causes the following error: > {code} > 2017-11-17 19:55:41,165 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: > Error replaying edit log at offset 1048576.
Expected transaction ID was 87 > Recent opcode offsets: 1048576 > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: > got premature end-of-file at txid 86; expected file to go up to 85 > at > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197) > at > org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) > at > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189) > at > org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:205) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393) > 2017-11-17 19:55:41,165 WARN > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Error while reading > edits from disk. Will try again. > org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying > edit log at offset 1048576. 
Expected transaction ID was 87 > Recent opcode offsets: 1048576 > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:218) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481) > at >
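The guard this issue calls for can be sketched as follows (illustrative names, not the actual EditLogTailer code): after clamping {{endTxId}} to {{committedTxnId}}, a segment whose {{startTxId}} exceeds the clamped {{endTxId}} holds no durable transactions yet and should be skipped rather than opened.

```java
// Sketch of the fix's condition: mirror the clamping from the quoted
// snippet, then refuse to open segments with an empty durable txid range.
public class TailGuard {
    static boolean shouldOpenSegment(long startTxId, long endTxId,
                                     long committedTxnId,
                                     boolean onlyDurableTxns,
                                     boolean inProgressOk) {
        if (onlyDurableTxns && inProgressOk) {
            // same clamping as in the reported code
            endTxId = Math.min(endTxId, committedTxnId);
        }
        // the missing check: an inverted range means "nothing to tail yet"
        return startTxId <= endTxId;
    }
}
```

With the numbers from the log above (segment starting at txid 87, only 85 committed), the clamped range inverts and the segment is skipped instead of producing a PrematureEOFException.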
[jira] [Updated] (HDFS-12872) EC Checksum broken when BlockAccessToken is enabled
[ https://issues.apache.org/jira/browse/HDFS-12872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-12872: - Status: Patch Available (was: Open) > EC Checksum broken when BlockAccessToken is enabled > --- > > Key: HDFS-12872 > URL: https://issues.apache.org/jira/browse/HDFS-12872 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Reporter: Xiao Chen >Assignee: Xiao Chen >Priority: Critical > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-12872.repro.patch > > > It appears {{hdfs ec -checksum}} doesn't work when block access token is > enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12872) EC Checksum broken when BlockAccessToken is enabled
[ https://issues.apache.org/jira/browse/HDFS-12872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271922#comment-16271922 ] Xiao Chen commented on HDFS-12872: -- Hi [~eddyxu] [~drankye] [~umamaheswararao], I have some questions on this one; appreciate your inputs. I'm attaching a patch for discussion. The test fails without my attempted fix and passes with it. My understanding is that the checksum has to be done at the block group level, by calculating over each internal block in the group. From the following code, it appears this is done on a single datanode instead of by the client (the DN will later calculate it in {{DataXceiver#blockGroupChecksum}} by running the checksum computers), so it would be {{done}} as long as it can get a result from any 1 DN.
{code}
    boolean done = false;
    for (int j = 0; !done && j < datanodes.length; j++) {
      try {
        tryDatanode(blockGroup, stripedBlockInfo, datanodes[j], requestedNumBytes);
        done = true;
      } catch (InvalidBlockTokenException ibte) {
        if (bgIdx > getLastRetriedIndex()) {
          setLastRetriedIndex(bgIdx);
          done = true; // actually it's not done; but we'll retry
          bgIdx--; // repeat at bgIdx-th block
          setRefetchBlocks(true);
        }
      } catch (IOException ie) {
        LOG.warn("src={}" + ", datanodes[{}]={}", getSrc(), j, datanodes[j], ie);
      }
    }
{code}
So it appears the bug is just that we'd want to always use the first block token (index==0) to authenticate with the datanodes, since the block group's id will be equal to the id of the first internal block. Is my understanding above correct? For testing purposes, I think we may want to make sure this works for all DNs, not just 1 DN, in unit tests. (Verified the posted patch works by removing {{!done &&}} and looking for {{got reply from }} messages.) A related question: I was trying to look at the actual storage of these blocks. This was done by first running the unit test to clean up the local dir, then setting a breakpoint at the end of the test.
It seems to me, listing the local dir, I always get the same blocks on different datanodes (e.g. 9223372036854775792, 9223372036854775791 and 9223372036854775790 are the same). Is this by design? Why are they the same if they're storing different cells?
{noformat}
xiao-MBP:hadoop xiao$ for f in $(find /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/ -name *blk*);do ls -lh $f ;done
-rw-r--r-- 1 xiao staff 128M Nov 29 15:54 /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data//data1/current/BP-50698545-10.0.0.51-1511999667872/current/finalized/subdir0/subdir0/blk_-9223372036854775792
-rw-r--r-- 1 xiao staff 1.0M Nov 29 15:54 /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data//data1/current/BP-50698545-10.0.0.51-1511999667872/current/finalized/subdir0/subdir0/blk_-9223372036854775792_1001.meta
-rw-r--r-- 1 xiao staff 72M Nov 29 15:54 /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data//data2/current/BP-50698545-10.0.0.51-1511999667872/current/finalized/subdir0/subdir0/blk_-9223372036854775774
-rw-r--r-- 1 xiao staff 576K Nov 29 15:54 /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data//data2/current/BP-50698545-10.0.0.51-1511999667872/current/finalized/subdir0/subdir0/blk_-9223372036854775774_1002.meta
-rw-r--r-- 1 xiao staff 128M Nov 29 15:54 /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data//data3/current/BP-50698545-10.0.0.51-1511999667872/current/finalized/subdir0/subdir0/blk_-9223372036854775791
-rw-r--r-- 1 xiao staff 1.0M Nov 29 15:54 /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data//data3/current/BP-50698545-10.0.0.51-1511999667872/current/finalized/subdir0/subdir0/blk_-9223372036854775791_1001.meta
-rw-r--r-- 1 xiao staff 72M Nov 29 15:54 /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data//data4/current/BP-50698545-10.0.0.51-1511999667872/current/finalized/subdir0/subdir0/blk_-9223372036854775776
-rw-r--r-- 1 xiao staff 576K Nov 29 15:54 /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data//data4/current/BP-50698545-10.0.0.51-1511999667872/current/finalized/subdir0/subdir0/blk_-9223372036854775776_1002.meta
-rw-r--r-- 1 xiao staff 128M Nov 29 15:54 /Users/xiao/Desktop/repo/hdfs/xiao/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data//data5/current/BP-50698545-10.0.0.51-1511999667872/current/finalized/subdir0/subdir0/blk_-9223372036854775790
-rw-r--r-- 1 xiao staff 1.0M Nov 29 15:54
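One property worth noting when reading listings like the one above: in HDFS erasure coding, striped block group IDs are allocated sequentially starting near Long.MIN_VALUE, and an internal block's ID is the group ID plus its index within the group, so a fresh test cluster tends to produce the same block IDs on every run. A minimal sketch of the derivation (class and method names here are illustrative, not Hadoop's actual code):

```java
public class StripedBlockIdSketch {
    // A block group reserves a contiguous range of IDs (the low bits of the
    // group ID are zero); internal block i in the group is simply groupId + i.
    static long internalBlockId(long blockGroupId, int indexInGroup) {
        return blockGroupId + indexInGroup;
    }

    public static void main(String[] args) {
        // Near Long.MIN_VALUE, as in the listing above.
        long firstGroup = -9223372036854775792L;
        for (int i = 0; i < 3; i++) {
            System.out.println("blk_" + internalBlockId(firstGroup, i));
        }
    }
}
```

Under this scheme, blk_-9223372036854775792, blk_-9223372036854775791 and blk_-9223372036854775790 would simply be indices 0, 1 and 2 of the first block group, which may explain why the same names recur even though the blocks store different cells.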
[jira] [Updated] (HDFS-12872) EC Checksum broken when BlockAccessToken is enabled
[ https://issues.apache.org/jira/browse/HDFS-12872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-12872: - Attachment: HDFS-12872.repro.patch
[jira] [Commented] (HDFS-12836) startTxId could be greater than endTxId when tailing in-progress edit log
[ https://issues.apache.org/jira/browse/HDFS-12836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271893#comment-16271893 ] genericqa commented on HDFS-12836: -- (x) *-1 overall*
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 10m 36s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 18s | trunk passed |
| +1 | compile | 1m 7s | trunk passed |
| +1 | checkstyle | 0m 38s | trunk passed |
| +1 | mvnsite | 1m 9s | trunk passed |
| +1 | shadedclient | 11m 19s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 7s | trunk passed |
| +1 | javadoc | 0m 55s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 9s | the patch passed |
| +1 | compile | 1m 4s | the patch passed |
| +1 | javac | 1m 4s | the patch passed |
| +1 | checkstyle | 0m 38s | the patch passed |
| +1 | mvnsite | 1m 9s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 42s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 22s | the patch passed |
| +1 | javadoc | 0m 55s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 82m 30s | hadoop-hdfs in the patch failed. |
| -1 | asflicense | 0m 30s | The patch generated 2 ASF License warnings. |
| | | 147m 57s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 |
| | hadoop.fs.TestHDFSFileContextMainOperations |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.server.diskbalancer.TestDiskBalancerRPC |
| | hadoop.hdfs.server.namenode.snapshot.TestSnapshot |
| | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
| | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
| | hadoop.hdfs.server.blockmanagement.TestReplicationPolicy |
| | hadoop.hdfs.TestPread |
| | hadoop.hdfs.server.namenode.snapshot.TestSnapshotReplication |
| | hadoop.hdfs.server.diskbalancer.TestDiskBalancerWithMockMover |
| | hadoop.hdfs.server.namenode.TestReencryption |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12836 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12899881/HDFS-12836.3.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 34814f69f43b 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk
[jira] [Updated] (HDFS-12000) Ozone: Container : Add key versioning support-1
[ https://issues.apache.org/jira/browse/HDFS-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-12000: -- Attachment: HDFS-12000-HDFS-7240.007.patch
Updated to the v007 patch. Rewrote many places, as lots of code has changed since the previous patch. Posting v007 to trigger Jenkins; still doing more testing locally. Please note that this is still a work in progress. This patch focuses only on the server side. Specifically, it adds versioning to blocks and persists the information about older versions to the meta store. From the client side, reads and writes remain hidden from versioning, i.e. a write still always rewrites the whole key, and a read still always returns only the most recently committed version of the key. Will follow up with another JIRA for further read/write changes.
> Ozone: Container : Add key versioning support-1
> ---
>
> Key: HDFS-12000
> URL: https://issues.apache.org/jira/browse/HDFS-12000
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Anu Engineer
> Assignee: Chen Liang
> Labels: OzonePostMerge
> Attachments: HDFS-12000-HDFS-7240.001.patch, HDFS-12000-HDFS-7240.002.patch, HDFS-12000-HDFS-7240.003.patch, HDFS-12000-HDFS-7240.004.patch, HDFS-12000-HDFS-7240.005.patch, HDFS-12000-HDFS-7240.007.patch, OzoneVersion.001.pdf
>
> The rest interface of ozone supports versioning of keys. This support comes from the containers and how chunks are managed to support this feature. This JIRA tracks that feature. Will post a detailed design doc so that we can talk about this feature.
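The server-side behavior described in the v007 patch comment (every write appends a whole new version, while reads only ever surface the most recently committed one) can be sketched as follows; the class and method names are illustrative, not Ozone's actual API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch of key versioning as described in the comment above:
// writes always create a new version; reads see only the latest one.
public class VersionedKeySketch {
    private final Map<String, List<byte[]>> versions = new HashMap<>();

    // A write always rewrites the whole key as a new version; older
    // versions stay in the (here, in-memory) meta store.
    void write(String key, byte[] data) {
        versions.computeIfAbsent(key, k -> new ArrayList<>()).add(data);
    }

    // A read returns only the most recently committed version, hiding
    // the version history from the client.
    byte[] read(String key) {
        List<byte[]> v = versions.get(key);
        return (v == null || v.isEmpty()) ? null : v.get(v.size() - 1);
    }
}
```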
[jira] [Updated] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
[ https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12665: -- Status: Open (was: Patch Available)
> [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
> -
>
> Key: HDFS-12665
> URL: https://issues.apache.org/jira/browse/HDFS-12665
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Attachments: HDFS-12665-HDFS-9806.001.patch, HDFS-12665-HDFS-9806.002.patch, HDFS-12665-HDFS-9806.003.patch, HDFS-12665-HDFS-9806.004.patch, HDFS-12665-HDFS-9806.005.patch, HDFS-12665-HDFS-9806.006.patch, HDFS-12665-HDFS-9806.007.patch, HDFS-12665-HDFS-9806.008.patch, HDFS-12665-HDFS-9806.009.patch, HDFS-12665-HDFS-9806.010.patch, HDFS-12665-HDFS-9806.011.patch, HDFS-12665-HDFS-9806.012.patch
>
> The design of Provided Storage requires the use of an AliasMap to manage the mapping between blocks of files on the local HDFS and ranges of files on a remote storage system. To reduce load on the Namenode, this can be done using a pluggable external service (e.g. AzureTable, Cassandra, Ratis). However, to aid adoption and ease of deployment, we propose an in-memory version.
> This AliasMap will be a wrapper around LevelDB (already a dependency from the Timeline Service) and use protobuf for the key (blockpool, blockid, and genstamp) and the value (url, offset, length, nonce). The in-memory service will also have a configurable port on which it will listen for updates from Storage Policy Satisfier (SPS) Coordinating Datanodes (C-DN).
[jira] [Updated] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
[ https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12665: -- Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)
[ https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12665: -- Attachment: HDFS-12665-HDFS-9806.012.patch Updated patch, rebased on the 9806 feature branch.
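The key/value layout described in this issue (a key of blockpool, blockid and genstamp mapping to a url, offset, length and nonce) can be sketched with a plain in-memory map standing in for the LevelDB store; all names below are illustrative, not the actual HDFS-9806 classes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class AliasMapSketch {
    // Key fields from the issue description: blockpool, blockid, genstamp.
    static final class BlockKey {
        final String blockPool;
        final long blockId;
        final long genStamp;
        BlockKey(String blockPool, long blockId, long genStamp) {
            this.blockPool = blockPool;
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof BlockKey)) return false;
            BlockKey k = (BlockKey) o;
            return blockId == k.blockId && genStamp == k.genStamp
                && blockPool.equals(k.blockPool);
        }
        @Override public int hashCode() {
            return Objects.hash(blockPool, blockId, genStamp);
        }
    }

    // Value fields from the issue description: url, offset, length, nonce.
    static final class ProvidedLocation {
        final String url;
        final long offset;
        final long length;
        final byte[] nonce;
        ProvidedLocation(String url, long offset, long length, byte[] nonce) {
            this.url = url;
            this.offset = offset;
            this.length = length;
            this.nonce = nonce;
        }
    }

    // A HashMap stands in here for the LevelDB store of the real design.
    private final Map<BlockKey, ProvidedLocation> store = new HashMap<>();

    void put(BlockKey key, ProvidedLocation value) { store.put(key, value); }

    ProvidedLocation get(BlockKey key) { return store.get(key); }
}
```

In the actual design the key and value would be protobuf messages rather than Java classes, but the lookup contract is the same.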
[jira] [Updated] (HDFS-12685) [READ] FsVolumeImpl exception when scanning Provided storage volume
[ https://issues.apache.org/jira/browse/HDFS-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12685: -- Status: Open (was: Patch Available)
> [READ] FsVolumeImpl exception when scanning Provided storage volume
> ---
>
> Key: HDFS-12685
> URL: https://issues.apache.org/jira/browse/HDFS-12685
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ewan Higgs
> Assignee: Virajith Jalaparti
> Attachments: HDFS-12685-HDFS-9806.001.patch, HDFS-12685-HDFS-9806.002.patch, HDFS-12685-HDFS-9806.003.patch
>
> I left a Datanode running overnight and found this in the logs in the morning:
> {code}
> 2017-10-18 23:51:54,391 ERROR datanode.DirectoryScanner: Error compiling report for the volume, StorageId: DS-e75ebc3c-6b12-424e-875a-a4ae1a4dcc29
> java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: URI scheme is not "file"
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:544)
> at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:393)
> at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
> at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> Caused by: java.lang.IllegalArgumentException: URI scheme is not "file"
> at java.io.File.<init>(File.java:421)
>
[jira] [Commented] (HDFS-12685) [READ] FsVolumeImpl exception when scanning Provided storage volume
[ https://issues.apache.org/jira/browse/HDFS-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271865#comment-16271865 ] Virajith Jalaparti commented on HDFS-12685: --- In patch v2, {{ScanInfo.blockLength}} is passed in as a parameter to the new constructor, so it is set to whatever value is passed in. I think this is fine. Anyway, to ensure that it is understood that {{DirectoryScanner}} is disabled on {{PROVIDED}} volumes, I added a new test case in v3. In the future, if {{DirectoryScanner}} is enabled on {{PROVIDED}} volumes, this test will fail and has to be modified appropriately.
> [READ] FsVolumeImpl exception when scanning Provided storage volume
> ---
[jira] [Updated] (HDFS-12685) [READ] FsVolumeImpl exception when scanning Provided storage volume
[ https://issues.apache.org/jira/browse/HDFS-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12685: -- Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-12685) [READ] FsVolumeImpl exception when scanning Provided storage volume
[ https://issues.apache.org/jira/browse/HDFS-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-12685: -- Attachment: HDFS-12685-HDFS-9806.003.patch
[jira] [Updated] (HDFS-9240) Use Builder pattern for BlockLocation constructors
[ https://issues.apache.org/jira/browse/HDFS-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-9240: - Status: Patch Available (was: Open)
> Use Builder pattern for BlockLocation constructors
> --
>
> Key: HDFS-9240
> URL: https://issues.apache.org/jira/browse/HDFS-9240
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Xiaoyu Yao
> Assignee: Virajith Jalaparti
> Priority: Minor
> Attachments: HDFS-9240.001.patch, HDFS-9240.002.patch, HDFS-9240.003.patch
>
> This JIRA is opened to refactor the 8 telescoping constructors of the BlockLocation class with the Builder pattern.
[jira] [Commented] (HDFS-9240) Use Builder pattern for BlockLocation constructors
[ https://issues.apache.org/jira/browse/HDFS-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271859#comment-16271859 ] Virajith Jalaparti commented on HDFS-9240: -- Looks like Jenkins missed this one. Triggering it again.
[jira] [Updated] (HDFS-9240) Use Builder pattern for BlockLocation constructors
[ https://issues.apache.org/jira/browse/HDFS-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virajith Jalaparti updated HDFS-9240: - Status: Open (was: Patch Available)
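For context, the refactor proposed in HDFS-9240 replaces telescoping constructors with a fluent builder, so callers set only the fields they care about and defaults cover the rest. A minimal sketch (the field set here is illustrative, not the full BlockLocation API):

```java
// Hedged sketch of the Builder pattern applied to a BlockLocation-like class.
public class BlockLocationSketch {
    private final String[] hosts;
    private final String[] names;
    private final long offset;
    private final long length;
    private final boolean corrupt;

    private BlockLocationSketch(Builder b) {
        hosts = b.hosts;
        names = b.names;
        offset = b.offset;
        length = b.length;
        corrupt = b.corrupt;
    }

    long getOffset() { return offset; }
    long getLength() { return length; }
    boolean isCorrupt() { return corrupt; }

    // One builder replaces the 8 telescoping constructors.
    static class Builder {
        private String[] hosts = new String[0];
        private String[] names = new String[0];
        private long offset;
        private long length;
        private boolean corrupt;

        Builder setHosts(String[] h) { hosts = h; return this; }
        Builder setNames(String[] n) { names = n; return this; }
        Builder setOffset(long o) { offset = o; return this; }
        Builder setLength(long l) { length = l; return this; }
        Builder setCorrupt(boolean c) { corrupt = c; return this; }

        BlockLocationSketch build() { return new BlockLocationSketch(this); }
    }
}
```

A caller then writes `new Builder().setOffset(0L).setLength(128L).build()` instead of picking the right one of eight overloads.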
[jira] [Commented] (HDFS-12594) SnapshotDiff - snapshotDiff fails if the snapshotDiff report exceeds the RPC response limit
[ https://issues.apache.org/jira/browse/HDFS-12594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271746#comment-16271746 ] Tsz Wo Nicholas Sze commented on HDFS-12594: Patch looks good. - Please fix the checkstyle warnings. - Remove the empty line added to findbugsExcludeFile.xml - Do you want to add also testDiffReportWithMillionFiles since you already have it? > SnapshotDiff - snapshotDiff fails if the snapshotDiff report exceeds the RPC > response limit > --- > > Key: HDFS-12594 > URL: https://issues.apache.org/jira/browse/HDFS-12594 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee > Attachments: HDFS-12594.001.patch, HDFS-12594.002.patch, > HDFS-12594.003.patch, HDFS-12594.004.patch, HDFS-12594.005.patch, > HDFS-12594.006.patch, HDFS-12594.007.patch, HDFS-12594.008.patch, > HDFS-12594.009.patch, SnapshotDiff_Improvemnets .pdf > > > The snapshotDiff command fails if the snapshotDiff report size is larger than > the configuration value of ipc.maximum.response.length which is by default > 128 MB. > Worst case, with all Renames ops in sanpshots each with source and target > name equal to MAX_PATH_LEN which is 8k characters, this would result in at > 8192 renames. > > SnapshotDiff is currently used by distcp to optimize copy operations and in > case of the the diff report exceeding the limit , it fails with the below > exception: > Test set: > org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport > --- > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 112.095 sec > <<< FAILURE! - in > org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport > testDiffReportWithMillionFiles(org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport) > Time elapsed: 111.906 sec <<< ERROR! 
> java.io.IOException: Failed on local exception: > org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length; > Host Details : local host is: "hw15685.local/10.200.5.230"; destination host > is: "localhost":59808; > Attached is the proposal for the changes required.
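As an operational stopgap (not the fix this JIRA proposes), the RPC cap that the diff report hits can be raised via the {{ipc.maximum.response.length}} key named in the description. The 256 MB value below is purely illustrative; raising the cap trades memory for headroom and does not address the underlying scalability issue:

```xml
<!-- core-site.xml: illustrative workaround only -->
<property>
  <name>ipc.maximum.response.length</name>
  <value>268435456</value> <!-- 256 MB; the default is 134217728 (128 MB) -->
</property>
```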
[jira] [Created] (HDFS-12872) EC Checksum broken when BlockAccessToken is enabled
Xiao Chen created HDFS-12872: Summary: EC Checksum broken when BlockAccessToken is enabled Key: HDFS-12872 URL: https://issues.apache.org/jira/browse/HDFS-12872 Project: Hadoop HDFS Issue Type: Bug Components: erasure-coding Reporter: Xiao Chen Assignee: Xiao Chen Priority: Critical It appears {{hdfs ec -checksum}} doesn't work when block access token is enabled.
[jira] [Comment Edited] (HDFS-12866) Recursive delete of a large directory or snapshot makes namenode unresponsive
[ https://issues.apache.org/jira/browse/HDFS-12866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271732#comment-16271732 ] Yongjun Zhang edited comment on HDFS-12866 at 11/29/17 11:04 PM: - Thanks much for the feedback [~kihwal] and [~daryn]. Yes, with snapshot, an INode won't be disconnected when it's in a previous snapshot; it will be replaced with an INodeReference instead. So we can use a bit in the INode to indicate whether it's "disconnected" instead of physically disconnecting it from the parent. Indeed I was thinking of traversing to the root to check, as done in {{FSNamesystem#isFileDeleted}}; it costs some time, but we can find out whether an INode is disconnected, right? Optimizing the permission checking etc. would help; however, without postponing the deletion work until later, if the tree is large enough, we can still hit this problem. Quota computation will still be done, just not right away. Hopefully that won't cause too much of a problem? So the main issue of this approach is the cost of traversing to the root to check if any ancestor is disconnected? I wonder how bad it is. In IBR and FBR, can we assume the file exists if the INode is there? The block deletion step (step 2) will get them removed later anyway. Thanks again. was (Author: yzhangal): Thanks much for the feedback [~kihwal] and [~daryn]. Yes, with snapshot, an INode won't be disconnected when it's in a previous snapshot; it will be replaced with an INodeReference instead. So we can use a bit in the INode to indicate whether it's "disconnected" instead of physically disconnecting it from the parent. Indeed I was thinking of traversing to the root to check, as done in {{FSNamesystem#isFileDeleted}}; it costs some time, but we can find out whether an INode is disconnected, right? Optimizing the permission checking etc. would help; however, without postponing the deletion work until later, if the tree is large enough, we can still hit this problem. Quota computation will still be done, just not right away.
Hopefully that won't cause too much of a problem? So the main issue of this approach is the cost of traversing to the root to check if any ancestor is disconnected? I wonder how bad it is. In IBR and FBR, can we assume the file exists if the INode is there? The block deletion step (step 2) will get them removed later anyway. Thanks again. > Recursive delete of a large directory or snapshot makes namenode unresponsive > - > > Key: HDFS-12866 > URL: https://issues.apache.org/jira/browse/HDFS-12866 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Yongjun Zhang > > Currently file/directory deletion happens in two steps (see > {{FSNamesystem#delete(String src, boolean recursive, boolean logRetryCache)}}): > # Do the following under the fsn write lock and release the lock afterwards > ** 1.1 recursively traverse the target, collect INodes and all blocks to be > deleted > ** 1.2 delete all INodes > # Delete the blocks to be deleted incrementally, chunk by chunk. That is, in > a loop, do: > ** acquire the fsn write lock, > ** delete a chunk of blocks > ** release the fsn write lock > Breaking the deletion into two steps avoids holding the fsn write lock for too > long and thus making the NN unresponsive. However, even with this, when deleting a > large directory, or deleting a snapshot that has a lot of contents, step 1 > itself can take a long time, still holding the fsn write lock for too long > and making the NN unresponsive. > A possible solution would be to add one more sub-step in step 1, and only > hold the fsn write lock in sub-step 1.1: > * 1.1 hold the fsn write lock, disconnect the target to be deleted from its > parent dir, release the lock > * 1.2 recursively traverse the target, collect INodes and all blocks to be > deleted > * 1.3 delete all INodes > Then do step 2. > This means any operation on any file/dir needs to check if its ancestor is > deleted (ancestor is disconnected), similar to what's done in the > FSNamesystem#isFileDeleted method.
> I'm throwing the thought here for further discussion. Welcome comments and > inputs.
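The chunked block-deletion loop described in step 2 above can be sketched as follows; this uses a stand-in lock and hypothetical names (the real code lives in FSNamesystem and uses its own batching constant), but it shows why the lock is re-acquired per chunk:

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch: collected blocks are removed in fixed-size batches,
// releasing the (stand-in) namesystem write lock between batches so other
// operations can interleave. Names and the batch size are hypothetical.
class ChunkedDeleter {
    static final int BLOCK_DELETION_INCREMENT = 1000;
    private final ReentrantLock fsnWriteLock = new ReentrantLock();
    int lockAcquisitions = 0;  // exposed for illustration

    int deleteBlocks(List<Long> collectedBlocks) {
        Queue<Long> pending = new ArrayDeque<>(collectedBlocks);
        int deleted = 0;
        while (!pending.isEmpty()) {
            fsnWriteLock.lock();  // re-acquire per chunk
            lockAcquisitions++;
            try {
                for (int i = 0; i < BLOCK_DELETION_INCREMENT && !pending.isEmpty(); i++) {
                    pending.poll();  // stand-in for removing the block from the blocks map
                    deleted++;
                }
            } finally {
                fsnWriteLock.unlock();  // let other operations run between chunks
            }
        }
        return deleted;
    }
}
```

The JIRA's point is that step 1 (the collection under the lock) has no such chunking, which is where the unresponsiveness comes from.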
[jira] [Commented] (HDFS-12866) Recursive delete of a large directory or snapshot makes namenode unresponsive
[ https://issues.apache.org/jira/browse/HDFS-12866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271732#comment-16271732 ] Yongjun Zhang commented on HDFS-12866: -- Thanks much for the feedback [~kihwal] and [~daryn]. Yes, with snapshot, an INode won't be disconnected when it's in a previous snapshot; it will be replaced with an INodeReference instead. So we can use a bit in the INode to indicate whether it's "disconnected" instead of physically disconnecting it from the parent. Indeed I was thinking of traversing to the root to check, as done in {{FSNamesystem#isFileDeleted}}; it costs some time, but we can find out whether an INode is disconnected, right? Optimizing the permission checking etc. would help; however, without postponing the deletion work until later, if the tree is large enough, we can still hit this problem. Quota computation will still be done, just not right away. Hopefully that won't cause too much of a problem? So the main issue of this approach is the cost of traversing to the root to check if any ancestor is disconnected? I wonder how bad it is. In IBR and FBR, can we assume the file exists if the INode is there? The block deletion step (step 2) will get them removed later anyway. Thanks again. > Recursive delete of a large directory or snapshot makes namenode unresponsive > - > > Key: HDFS-12866 > URL: https://issues.apache.org/jira/browse/HDFS-12866 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Yongjun Zhang > > Currently file/directory deletion happens in two steps (see > {{FSNamesystem#delete(String src, boolean recursive, boolean logRetryCache)}}): > # Do the following under the fsn write lock and release the lock afterwards > ** 1.1 recursively traverse the target, collect INodes and all blocks to be > deleted > ** 1.2 delete all INodes > # Delete the blocks to be deleted incrementally, chunk by chunk.
That is, in > a loop, do: > ** acquire the fsn write lock, > ** delete a chunk of blocks > ** release the fsn write lock > Breaking the deletion into two steps avoids holding the fsn write lock for too > long and thus making the NN unresponsive. However, even with this, when deleting a > large directory, or deleting a snapshot that has a lot of contents, step 1 > itself can take a long time, still holding the fsn write lock for too long > and making the NN unresponsive. > A possible solution would be to add one more sub-step in step 1, and only > hold the fsn write lock in sub-step 1.1: > * 1.1 hold the fsn write lock, disconnect the target to be deleted from its > parent dir, release the lock > * 1.2 recursively traverse the target, collect INodes and all blocks to be > deleted > * 1.3 delete all INodes > Then do step 2. > This means any operation on any file/dir needs to check if its ancestor is > deleted (ancestor is disconnected), similar to what's done in the > FSNamesystem#isFileDeleted method. > I'm throwing the thought here for further discussion. Welcome comments and > inputs.
[jira] [Updated] (HDFS-12681) Make HdfsLocatedFileStatus a subtype of LocatedFileStatus
[ https://issues.apache.org/jira/browse/HDFS-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-12681: - Attachment: HDFS-12681.16.patch Soright, will try the same patch, see if Jenkins is feeling more generous this time. > Make HdfsLocatedFileStatus a subtype of LocatedFileStatus > - > > Key: HDFS-12681 > URL: https://issues.apache.org/jira/browse/HDFS-12681 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chris Douglas >Assignee: Chris Douglas > Attachments: HDFS-12681.00.patch, HDFS-12681.01.patch, > HDFS-12681.02.patch, HDFS-12681.03.patch, HDFS-12681.04.patch, > HDFS-12681.05.patch, HDFS-12681.06.patch, HDFS-12681.07.patch, > HDFS-12681.08.patch, HDFS-12681.09.patch, HDFS-12681.10.patch, > HDFS-12681.11.patch, HDFS-12681.12.patch, HDFS-12681.13.patch, > HDFS-12681.14.patch, HDFS-12681.15.patch, HDFS-12681.16.patch > > > {{HdfsLocatedFileStatus}} is a subtype of {{HdfsFileStatus}}, but not of > {{LocatedFileStatus}}. Conversion requires copying common fields and shedding > unknown data. It would be cleaner and sufficient for {{HdfsFileStatus}} to > extend {{LocatedFileStatus}}.
[jira] [Commented] (HDFS-12051) Intern INOdeFileAttributes$SnapshotCopy.name byte[] arrays to save memory
[ https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271687#comment-16271687 ] Yongjun Zhang commented on HDFS-12051: -- Hi [~mi...@cloudera.com], Thanks for working on this issue. I did a review and have the following high-level comments: 1. The original NameCache works like this: when loading the fsimage, it puts names into a transient cache and remembers the count of each name; if the count of a name reaches a threshold (configurable, with a default of 10), it promotes the name to the permanent cache. After the fsimage is loaded, it cleans up the transient cache and freezes the final cache. The problem described here is about calculating a snapshot diff, which happens after fsimage loading. Thus any new name, even if it appears many times, would not benefit from the NameCache. Let's call this solution 1. Your change is to always allow the cache to be updated; let's call it solution 2. 2. If we modify solution 1 to keep updating the cache instead of freezing it, we have a chance to address the problem here; however, depending on the threshold, the number of entries in the final cache of solution 1 can be very different, and thus the memory footprint can be very different. 3. The cache size configured in solution 2 would impact the final memory footprint too. If it's configured too small, we might end up with many duplicates. So having a reasonable default configuration would be important. It's so internal that we may not easily make a good recommendation to users on when to adjust it. 4. How much memory are we saving when saying "8.5% reduction"? 5. "In practice most of the time some names occur much more frequently than others". I wonder if you have examples from the case you studied: why do some names appear so much more than others, and what patterns do the names have? Is it an artifact of the snapshot implementation? 6. Solution 2 might benefit some cases, but make other cases worse.
If we decide to proceed, wonder if we can make both solution1 and solution2 available, and make it switchable when needed. 7. Suggest to add more comments in code. For example. {{for (int colsnChainLen = 0; colsnChainLen < 5; colsnChainLen++) {}}, what this does, and why "5". Thanks. > Intern INOdeFileAttributes$SnapshotCopy.name byte[] arrays to save memory > - > > Key: HDFS-12051 > URL: https://issues.apache.org/jira/browse/HDFS-12051 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch > > > When snapshot diff operation is performed in a NameNode that manages several > million HDFS files/directories, NN needs a lot of memory. Analyzing one heap > dump with jxray (www.jxray.com), we observed that duplicate byte[] arrays > result in 6.5% memory overhead, and most of these arrays are referenced by > {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}} > and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}: > {code} > 19. DUPLICATE PRIMITIVE ARRAYS > Types of duplicate objects: > Ovhd Num objs Num unique objs Class name > 3,220,272K (6.5%) 104749528 25760871 byte[] > > 1,841,485K (3.7%), 53194037 dup arrays (13158094 unique) > 3510556 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 2228255 > of byte[8](48, 48, 48, 48, 48, 48, 95, 48), 357439 of byte[17](112, 97, 114, > 116, 45, 109, 45, 48, 48, 48, ...), 237395 of byte[8](48, 48, 48, 48, 48, 49, > 95, 48), 227853 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), > 179193 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 169487 > of byte[8](48, 48, 48, 48, 48, 50, 95, 48), 145055 of byte[17](112, 97, 114, > 116, 45, 109, 45, 48, 48, 48, ...), 128134 of byte[8](48, 48, 48, 48, 48, 51, > 95, 48), 108265 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...) > ... 
and 45902395 more arrays, of which 13158084 are unique > <-- > org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name > <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.snapshotINode > <-- {j.u.ArrayList} <-- > org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- > org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs > <-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- > org.apache.hadoop.hdfs.server.namenode.INodeFile.features <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- ... (1 > elements) ... <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- >
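The interning idea under discussion can be sketched as follows. This is a simplified, hypothetical cache (not the actual NameCache or the attached patch) showing how equal byte[] contents collapse to one shared instance; a wrapper supplies value-based equals/hashCode, since raw arrays use identity semantics:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of interning byte[] name arrays. Names are illustrative.
class ByteArrayInterner {
    // Wrapper giving byte[] content-based equality for use as a map key.
    private static final class Key {
        final byte[] bytes;
        Key(byte[] bytes) { this.bytes = bytes; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && Arrays.equals(bytes, ((Key) o).bytes);
        }
        @Override public int hashCode() { return Arrays.hashCode(bytes); }
    }

    private final Map<Key, byte[]> cache = new HashMap<>();

    // Returns the canonical instance for this content; the first array
    // seen with given content becomes canonical, later duplicates are dropped.
    byte[] intern(byte[] name) {
        byte[] existing = cache.putIfAbsent(new Key(name), name);
        return existing != null ? existing : name;
    }

    int size() { return cache.size(); }
}
```

With the duplication profile quoted above (e.g. millions of copies of the same `part-m-000...` prefix bytes), each duplicate collapsed this way saves the array's footprint at the cost of one map entry per unique name.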
[jira] [Updated] (HDFS-12802) RBF: Control MountTableResolver cache size
[ https://issues.apache.org/jira/browse/HDFS-12802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-12802: --- Attachment: HDFS-12802.000.patch > RBF: Control MountTableResolver cache size > -- > > Key: HDFS-12802 > URL: https://issues.apache.org/jira/browse/HDFS-12802 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri > Attachments: HDFS-12802.000.patch > > > Currently, the {{MountTableResolver}} caches the resolutions for the > {{PathLocation}}. However, this cache can grow with no limits if there are a > lot of unique paths. Some of these cached resolutions might not be used at > all. > The {{MountTableResolver}} should clean the {{locationCache}} periodically.
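One common way to bound such a path-resolution cache is an LRU policy, sketched here with hypothetical names via LinkedHashMap's access order; the actual HDFS-12802 patch may use a different mechanism (e.g. the periodic cleanup the description mentions, or a Guava cache):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative bounded cache: least-recently-used entries are evicted
// once the entry count exceeds a fixed cap.
class BoundedLocationCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedLocationCache(int maxEntries) {
        super(16, 0.75f, true);  // accessOrder=true yields LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;  // drop the least-recently-used entry
    }
}
```

This keeps memory proportional to the cap rather than to the number of unique paths resolved.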
[jira] [Commented] (HDFS-11754) Make FsServerDefaults cache configurable.
[ https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271635#comment-16271635 ] Hudson commented on HDFS-11754: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13292 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13292/]) HDFS-11754. Make FsServerDefaults cache configurable. Contributed by (kihwal: rev 53509f295b5274059541565d7216bf98aa35347d) * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml > Make FsServerDefaults cache configurable. > - > > Key: HDFS-11754 > URL: https://issues.apache.org/jira/browse/HDFS-11754 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Mikhail Erofeev >Priority: Minor > Labels: newbie > Fix For: 3.0.0, 3.1.0, 2.10.0, 2.9.1, 2.8.4 > > Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, > HDFS-11754.003.patch, HDFS-11754.004.patch, HDFS-11754.005.patch, > HDFS-11754.006.patch > > > DFSClient caches the result of FsServerDefaults for 60 minutes. > But the 60 minutes time is not configurable. > Continuing the discussion from HDFS-11702, it would be nice if we can make > this configurable and make the default as 60 minutes.
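The caching behavior being made configurable can be sketched as a small time-to-live cache. The names below are illustrative stand-ins, not the DFSClient internals; the point is only that the expiry period becomes a constructor (i.e. configuration) parameter instead of a hard-coded 60 minutes:

```java
// Illustrative TTL cache for server defaults. The time is passed in
// explicitly so the logic is deterministic and easy to test.
class CachedServerDefaults<T> {
    private final long expirationMillis;  // would come from configuration
    private T cached;
    private long fetchTimeMillis = Long.MIN_VALUE;

    CachedServerDefaults(long expirationMillis) {
        this.expirationMillis = expirationMillis;
    }

    // Returns the cached value while fresh; refetches via the supplier
    // once the configured expiration period has elapsed.
    T get(java.util.function.Supplier<T> fetch, long nowMillis) {
        if (cached == null || nowMillis - fetchTimeMillis >= expirationMillis) {
            cached = fetch.get();
            fetchTimeMillis = nowMillis;
        }
        return cached;
    }
}
```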
[jira] [Updated] (HDFS-12836) startTxId could be greater than endTxId when tailing in-progress edit log
[ https://issues.apache.org/jira/browse/HDFS-12836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HDFS-12836: Attachment: HDFS-12836.3.patch Thanks [~jojochuang]. Attaching patch v3 to address the checkstyle issue. > startTxId could be greater than endTxId when tailing in-progress edit log > - > > Key: HDFS-12836 > URL: https://issues.apache.org/jira/browse/HDFS-12836 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.0-alpha1 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HDFS-12836.1.patch, HDFS-12836.2.patch, > HDFS-12836.3.patch > > > When {{dfs.ha.tail-edits.in-progress}} is true, edit log tailer will also > tail those in progress edit log segments. However, in the following code: > {code} > if (onlyDurableTxns && inProgressOk) { > endTxId = Math.min(endTxId, committedTxnId); > } > EditLogInputStream elis = EditLogFileInputStream.fromUrl( > connectionFactory, url, remoteLog.getStartTxId(), > endTxId, remoteLog.isInProgress()); > {code} > it is possible that {{remoteLog.getStartTxId()}} could be greater than > {{endTxId}}, and therefore will cause the following error: > {code} > 2017-11-17 19:55:41,165 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: > Error replaying edit log at offset 1048576. 
Expected transaction ID was 87 > Recent opcode offsets: 1048576 > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: > got premature end-of-file at txid 86; expected file to go up to 85 > at > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197) > at > org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) > at > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189) > at > org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:205) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393) > 2017-11-17 19:55:41,165 WARN > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Error while reading > edits from disk. Will try again. > org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying > edit log at offset 1048576. 
Expected transaction ID was 87 > Recent opcode offsets: 1048576 > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:218) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393) > Caused by: > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: > got premature end-of-file at txid 86; expected file to go up to 85 > at
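The guard implied by the description can be sketched as follows. This is an illustration of the invariant (after clamping to the committed transaction id, an in-progress segment whose start exceeds the clamped end holds no durable transactions yet and should be skipped), not the actual HDFS-12836 patch; the method name is hypothetical:

```java
// Illustrative check around the clamp quoted in the description.
class EditLogSegmentFilter {
    static boolean shouldOpen(long startTxId, long endTxId,
                              long committedTxnId,
                              boolean inProgress, boolean onlyDurableTxns) {
        if (onlyDurableTxns && inProgress) {
            endTxId = Math.min(endTxId, committedTxnId);  // same clamp as in the quoted code
        }
        // If the clamp pushed endTxId below startTxId, there is nothing
        // durable to read from this segment yet.
        return startTxId <= endTxId;
    }
}
```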
[jira] [Resolved] (HDFS-12604) StreamCapability enums are not displayed in javadoc
[ https://issues.apache.org/jira/browse/HDFS-12604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge resolved HDFS-12604. --- Resolution: Won't Fix StreamCapability enums are deprecated by HADOOP-15012. > StreamCapability enums are not displayed in javadoc > --- > > Key: HDFS-12604 > URL: https://issues.apache.org/jira/browse/HDFS-12604 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Minor > > http://hadoop.apache.org/docs/r3.0.0-beta1/api/org/apache/hadoop/fs/StreamCapabilities.html > {{StreamCapability#HFLUSH}} and {{StreamCapability#HSYNC}} are not displayed > in the doc.
[jira] [Commented] (HDFS-11576) Block recovery will fail indefinitely if recovery time > heartbeat interval
[ https://issues.apache.org/jira/browse/HDFS-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271561#comment-16271561 ] Lukas Majercak commented on HDFS-11576: --- Hi [~shv], [~chris.douglas], I've uploaded 012.patch to address some of Konstantin's comments: # Removed BLOCK_RECOVERY_TIMEOUT_MULTIPLIER from DFSConfigKeys and added it as a constant to BlockManager # For the log message; the start of the recovery is logged in internalReleaseLease and every rejected attempt is also logged in PendingRecoveryBlocks # PendingRecoveryBlocks.getTime(): this is there so that I can mock it for testing PendingRecoveryBlocks and I can't see a nicer solution to this, happy to hear suggestions # For testRecoveryTimeout(), I changed callRealMethod to be final but kept the name because "realMethodCalled" suggests the opposite logic # Overrode SleepAnswer.answer() instead of creating new protected SleepAnswer.callRealMethod() > Block recovery will fail indefinitely if recovery time > heartbeat interval > --- > > Key: HDFS-11576 > URL: https://issues.apache.org/jira/browse/HDFS-11576 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs, namenode >Affects Versions: 2.7.1, 2.7.2, 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2 >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Attachments: HDFS-11576.001.patch, HDFS-11576.002.patch, > HDFS-11576.003.patch, HDFS-11576.004.patch, HDFS-11576.005.patch, > HDFS-11576.006.patch, HDFS-11576.007.patch, HDFS-11576.008.patch, > HDFS-11576.009.patch, HDFS-11576.010.patch, HDFS-11576.011.patch, > HDFS-11576.012.patch, HDFS-11576.repro.patch > > > Block recovery will fail indefinitely if the time to recover a block is > always longer than the heartbeat interval. Scenario: > 1. DN sends heartbeat > 2. NN sends a recovery command to DN, recoveryID=X > 3. DN starts recovery > 4. DN sends another heartbeat > 5. NN sends a recovery command to DN, recoveryID=X+1 > 6. 
DN calls commitBlockSynchronization after succeeding with the first recovery to > NN, which fails because X < X+1 > ...
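The timeout-based fix discussed in the comments can be sketched like this; all names are illustrative stand-ins for the PendingRecoveryBlocks tracking (and the multiplier value is hypothetical), but the shape matches the idea: remember blocks with a recovery in flight and refuse to issue a new recovery id until a timeout derived from the heartbeat interval elapses:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: a recovery command is re-issued for a block only
// after the previous attempt has been pending longer than the timeout.
class PendingRecoverySketch {
    static final long RECOVERY_TIMEOUT_MULTIPLIER = 30;  // hypothetical constant
    private final long heartbeatIntervalMillis;
    private final Map<Long, Long> pendingSince = new HashMap<>();  // blockId -> start time

    PendingRecoverySketch(long heartbeatIntervalMillis) {
        this.heartbeatIntervalMillis = heartbeatIntervalMillis;
    }

    // Returns true if a recovery command may be issued for this block now;
    // rejecting repeat attempts prevents the recovery id from being bumped
    // on every heartbeat while a recovery is still running.
    boolean tryStartRecovery(long blockId, long nowMillis) {
        Long since = pendingSince.get(blockId);
        long timeout = RECOVERY_TIMEOUT_MULTIPLIER * heartbeatIntervalMillis;
        if (since != null && nowMillis - since < timeout) {
            return false;  // recovery already in flight
        }
        pendingSince.put(blockId, nowMillis);
        return true;
    }
}
```

Passing the clock in as a parameter mirrors the `getTime()` hook mentioned above, which exists so the timeout logic can be tested without real sleeps.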
[jira] [Updated] (HDFS-11754) Make FsServerDefaults cache configurable.
[ https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11754: -- Fix Version/s: 2.8.4 2.9.1 2.10.0 3.1.0 3.0.0 > Make FsServerDefaults cache configurable. > - > > Key: HDFS-11754 > URL: https://issues.apache.org/jira/browse/HDFS-11754 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Mikhail Erofeev >Priority: Minor > Labels: newbie > Fix For: 3.0.0, 3.1.0, 2.10.0, 2.9.1, 2.8.4 > > Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, > HDFS-11754.003.patch, HDFS-11754.004.patch, HDFS-11754.005.patch, > HDFS-11754.006.patch > > > DFSClient caches the result of FsServerDefaults for 60 minutes. > But the 60 minutes time is not configurable. > Continuing the discussion from HDFS-11702, it would be nice if we can make > this configurable and make the default as 60 minutes.
[jira] [Updated] (HDFS-11576) Block recovery will fail indefinitely if recovery time > heartbeat interval
[ https://issues.apache.org/jira/browse/HDFS-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukas Majercak updated HDFS-11576: -- Attachment: HDFS-11576.012.patch > Block recovery will fail indefinitely if recovery time > heartbeat interval > --- > > Key: HDFS-11576 > URL: https://issues.apache.org/jira/browse/HDFS-11576 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs, namenode >Affects Versions: 2.7.1, 2.7.2, 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2 >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Critical > Attachments: HDFS-11576.001.patch, HDFS-11576.002.patch, > HDFS-11576.003.patch, HDFS-11576.004.patch, HDFS-11576.005.patch, > HDFS-11576.006.patch, HDFS-11576.007.patch, HDFS-11576.008.patch, > HDFS-11576.009.patch, HDFS-11576.010.patch, HDFS-11576.011.patch, > HDFS-11576.012.patch, HDFS-11576.repro.patch > > > Block recovery will fail indefinitely if the time to recover a block is > always longer than the heartbeat interval. Scenario: > 1. DN sends heartbeat > 2. NN sends a recovery command to DN, recoveryID=X > 3. DN starts recovery > 4. DN sends another heartbeat > 5. NN sends a recovery command to DN, recoveryID=X+1 > 6. DN calls commitBlockSynchronization after succeeding with the first recovery to > NN, which fails because X < X+1 > ...
[jira] [Commented] (HDFS-11754) Make FsServerDefaults cache configurable.
[ https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271556#comment-16271556 ] Kihwal Lee commented on HDFS-11754: --- Thanks for working on this, [~erofeev]. Thanks for reviews, [~surendrasingh] and [~shahrs87]. I've committed this to trunk, branch-3.0, branch-3.0.0 (low risk), branch-2, branch-2.9 and branch-2.8. > Make FsServerDefaults cache configurable. > - > > Key: HDFS-11754 > URL: https://issues.apache.org/jira/browse/HDFS-11754 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Mikhail Erofeev >Priority: Minor > Labels: newbie > Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, > HDFS-11754.003.patch, HDFS-11754.004.patch, HDFS-11754.005.patch, > HDFS-11754.006.patch > > > DFSClient caches the result of FsServerDefaults for 60 minutes. > But the 60 minutes time is not configurable. > Continuing the discussion from HDFS-11702, it would be nice if we can make > this configurable and make the default as 60 minutes.
[jira] [Commented] (HDFS-11754) Make FsServerDefaults cache configurable.
[ https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271545#comment-16271545 ] Kihwal Lee commented on HDFS-11754: --- +1 looks good. > Make FsServerDefaults cache configurable. > - > > Key: HDFS-11754 > URL: https://issues.apache.org/jira/browse/HDFS-11754 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Mikhail Erofeev >Priority: Minor > Labels: newbie > Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, > HDFS-11754.003.patch, HDFS-11754.004.patch, HDFS-11754.005.patch, > HDFS-11754.006.patch > > > DFSClient caches the result of FsServerDefaults for 60 minutes. > But the 60 minutes time is not configurable. > Continuing the discussion from HDFS-11702, it would be nice if we can make > this configurable and make the default as 60 minutes.
[jira] [Commented] (HDFS-12681) Make HdfsLocatedFileStatus a subtype of LocatedFileStatus
[ https://issues.apache.org/jira/browse/HDFS-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271541#comment-16271541 ] Íñigo Goiri commented on HDFS-12681: I would ignore the remaining checkstyle warnings. The unit test failures seem spurious. Given that locally they pass (including the MapReduce ones): +1 Good to rerun QA just to get a cleaner build, though. > Make HdfsLocatedFileStatus a subtype of LocatedFileStatus > - > > Key: HDFS-12681 > URL: https://issues.apache.org/jira/browse/HDFS-12681 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chris Douglas >Assignee: Chris Douglas > Attachments: HDFS-12681.00.patch, HDFS-12681.01.patch, > HDFS-12681.02.patch, HDFS-12681.03.patch, HDFS-12681.04.patch, > HDFS-12681.05.patch, HDFS-12681.06.patch, HDFS-12681.07.patch, > HDFS-12681.08.patch, HDFS-12681.09.patch, HDFS-12681.10.patch, > HDFS-12681.11.patch, HDFS-12681.12.patch, HDFS-12681.13.patch, > HDFS-12681.14.patch, HDFS-12681.15.patch > > > {{HdfsLocatedFileStatus}} is a subtype of {{HdfsFileStatus}}, but not of > {{LocatedFileStatus}}. Conversion requires copying common fields and shedding > unknown data. It would be cleaner and sufficient for {{HdfsFileStatus}} to > extend {{LocatedFileStatus}}.
[jira] [Commented] (HDFS-11754) Make FsServerDefaults cache configurable.
[ https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271525#comment-16271525 ] Rushabh S Shah commented on HDFS-11754: --- bq. Can you please verify that the test failures are not related. Nevermind I ran all the failed tests locally and all of them succeeded except {{TestReadStripedFileWithMissingBlocks}}. Failure of {{TestReadStripedFileWithMissingBlocks}} is tracked by HDFS-12723. {noformat} [INFO] Running org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 25.154 s - in org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean [INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 51.121 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure [INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 83.275 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting [INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.54 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration [INFO] Running org.apache.hadoop.hdfs.server.federation.metrics.TestFederationMetrics [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.166 s - in org.apache.hadoop.hdfs.server.federation.metrics.TestFederationMetrics [INFO] Running org.apache.hadoop.hdfs.TestLeaseRecovery2 [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 69.896 s - in org.apache.hadoop.hdfs.TestLeaseRecovery2 [INFO] Running org.apache.hadoop.hdfs.TestPread [INFO] Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 123.237 s - in 
org.apache.hadoop.hdfs.TestPread [INFO] Running org.apache.hadoop.hdfs.TestReadStripedFileWithMissingBlocks [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 149.108 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestReadStripedFileWithMissingBlocks [ERROR] testReadFileWithMissingBlocks(org.apache.hadoop.hdfs.TestReadStripedFileWithMissingBlocks) Time elapsed: 148.948 s <<< ERROR! java.util.concurrent.TimeoutException: Timed out waiting for /foo to have all the internalBlocks at org.apache.hadoop.hdfs.StripedFileTestUtil.waitBlockGroupsReported(StripedFileTestUtil.java:295) at org.apache.hadoop.hdfs.StripedFileTestUtil.waitBlockGroupsReported(StripedFileTestUtil.java:256) at org.apache.hadoop.hdfs.TestReadStripedFileWithMissingBlocks.readFileWithMissingBlocks(TestReadStripedFileWithMissingBlocks.java:98) at org.apache.hadoop.hdfs.TestReadStripedFileWithMissingBlocks.testReadFileWithMissingBlocks(TestReadStripedFileWithMissingBlocks.java:82) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) [INFO] [INFO] Results: [INFO] [ERROR] Errors: [ERROR] TestReadStripedFileWithMissingBlocks.testReadFileWithMissingBlocks:82->readFileWithMissingBlocks:98 » Timeout [ERROR] Tests run: 48, Failures: 0, Errors: 1, Skipped: 0 {noformat} Also the patch applies to branch-2.8 also. 
So we should be good to get it in down to branch-2.8. > Make FsServerDefaults cache configurable. > - > > Key: HDFS-11754 > URL: https://issues.apache.org/jira/browse/HDFS-11754 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Mikhail Erofeev >Priority: Minor > Labels: newbie > Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, > HDFS-11754.003.patch, HDFS-11754.004.patch, HDFS-11754.005.patch, > HDFS-11754.006.patch > > > DFSClient caches the result of FsServerDefaults for 60 minutes. > But the 60 minutes time is not configurable. > Continuing the discussion from HDFS-11702, it would be nice if we can make > this
[jira] [Updated] (HDFS-11754) Make FsServerDefaults cache configurable.
[ https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11754: -- Target Version/s: 2.8.4 (was: 2.10.0) > Make FsServerDefaults cache configurable. > - > > Key: HDFS-11754 > URL: https://issues.apache.org/jira/browse/HDFS-11754 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Mikhail Erofeev >Priority: Minor > Labels: newbie > Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, > HDFS-11754.003.patch, HDFS-11754.004.patch, HDFS-11754.005.patch, > HDFS-11754.006.patch > > > DFSClient caches the result of FsServerDefaults for 60 minutes. > But the 60 minutes time is not configurable. > Continuing the discussion from HDFS-11702, it would be nice if we can make > this configurable and make the default as 60 minutes. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11754) Make FsServerDefaults cache configurable.
[ https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271503#comment-16271503 ] Rushabh S Shah commented on HDFS-11754: --- It would be nice if we can get it into 2.8.3 and above. > Make FsServerDefaults cache configurable. > - > > Key: HDFS-11754 > URL: https://issues.apache.org/jira/browse/HDFS-11754 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Mikhail Erofeev >Priority: Minor > Labels: newbie > Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, > HDFS-11754.003.patch, HDFS-11754.004.patch, HDFS-11754.005.patch, > HDFS-11754.006.patch > > > DFSClient caches the result of FsServerDefaults for 60 minutes. > But the 60 minutes time is not configurable. > Continuing the discussion from HDFS-11702, it would be nice if we can make > this configurable and make the default as 60 minutes. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11754) Make FsServerDefaults cache configurable.
[ https://issues.apache.org/jira/browse/HDFS-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271500#comment-16271500 ] Rushabh S Shah commented on HDFS-11754: --- +1 (non-binding) The changes lgtm. Thanks [~erofeev]! Can you please verify that the test failures are not related? [~kihwal]: can you please review this simple patch? > Make FsServerDefaults cache configurable. > - > > Key: HDFS-11754 > URL: https://issues.apache.org/jira/browse/HDFS-11754 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Rushabh S Shah >Assignee: Mikhail Erofeev >Priority: Minor > Labels: newbie > Attachments: HDFS-11754.001.patch, HDFS-11754.002.patch, > HDFS-11754.003.patch, HDFS-11754.004.patch, HDFS-11754.005.patch, > HDFS-11754.006.patch > > > DFSClient caches the result of FsServerDefaults for 60 minutes. > But the 60 minutes time is not configurable. > Continuing the discussion from HDFS-11702, it would be nice if we can make > this configurable and make the default as 60 minutes. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
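The change under review here amounts to turning a hard-coded 60-minute cache period into a configurable TTL. A hedged sketch of that pattern, with invented names (`ServerDefaultsCache`, `cachePeriodMs`, the `fetcher` supplier) rather than DFSClient's actual fields:

```java
// Time-based cache with a configurable refresh period, a simplified model of
// caching FsServerDefaults in the client. The TTL is a constructor parameter
// instead of a fixed 60 minutes.
class ServerDefaultsCache<T> {
    private final long cachePeriodMs; // configurable; was effectively fixed at 60 min
    private final java.util.function.Supplier<T> fetcher;
    private T cached;
    private long lastFetchMs = Long.MIN_VALUE;

    ServerDefaultsCache(long cachePeriodMs, java.util.function.Supplier<T> fetcher) {
        this.cachePeriodMs = cachePeriodMs;
        this.fetcher = fetcher;
    }

    // Returns the cached value, refetching (e.g. an RPC to the NameNode in the
    // real client) only when the configured period has elapsed.
    synchronized T get(long nowMs) {
        if (cached == null || nowMs - lastFetchMs > cachePeriodMs) {
            cached = fetcher.get();
            lastFetchMs = nowMs;
        }
        return cached;
    }
}
```

Passing the clock in as `nowMs` keeps the sketch testable; the real client would read the wall clock and the period from configuration, with 60 minutes as the default.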
[jira] [Updated] (HDFS-12840) Creating a replicated file in a EC zone does not correctly serialized in EditLogs
[ https://issues.apache.org/jira/browse/HDFS-12840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-12840: - Attachment: editsStored > Creating a replicated file in a EC zone does not correctly serialized in > EditLogs > - > > Key: HDFS-12840 > URL: https://issues.apache.org/jira/browse/HDFS-12840 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Affects Versions: 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Blocker > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-12840.00.patch, HDFS-12840.01.patch, > HDFS-12840.02.patch, HDFS-12840.reprod.patch, editsStored, editsStored > > > When create a replicated file in an existing EC zone, the edit logs does not > differentiate it from an EC file. When {{FSEditLogLoader}} to replay edits, > this file is treated as EC file, as a results, it crashes the NN because the > blocks of this file are replicated, which does not match with {{INode}}. > {noformat} > ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered > exception on operation AddBlockOp [path=/system/balancer.id, > penultimateBlock=NULL, lastBlock=blk_1073743259_2455, RpcClientId=, > RpcCallId=-2] > java.lang.IllegalArgumentException: reportedBlock is not striped > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.addStorage(BlockInfoStriped.java:118) > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.addBlock(DatanodeStorageInfo.java:256) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3141) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlockUnderConstruction(BlockManager.java:3068) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:3864) > at > 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processQueuedMessages(BlockManager.java:2916) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processQueuedMessagesForBlock(BlockManager.java:2903) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.addNewBlock(FSEditLogLoader.java:1069) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:532) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12836) startTxId could be greater than endTxId when tailing in-progress edit log
[ https://issues.apache.org/jira/browse/HDFS-12836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271353#comment-16271353 ] Wei-Chiu Chuang commented on HDFS-12836: Fix and test looks good to me. Would you please update the patch to fix the checkstyle warning? +1 after that. > startTxId could be greater than endTxId when tailing in-progress edit log > - > > Key: HDFS-12836 > URL: https://issues.apache.org/jira/browse/HDFS-12836 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.0-alpha1 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HDFS-12836.1.patch, HDFS-12836.2.patch > > > When {{dfs.ha.tail-edits.in-progress}} is true, edit log tailer will also > tail those in progress edit log segments. However, in the following code: > {code} > if (onlyDurableTxns && inProgressOk) { > endTxId = Math.min(endTxId, committedTxnId); > } > EditLogInputStream elis = EditLogFileInputStream.fromUrl( > connectionFactory, url, remoteLog.getStartTxId(), > endTxId, remoteLog.isInProgress()); > {code} > it is possible that {{remoteLog.getStartTxId()}} could be greater than > {{endTxId}}, and therefore will cause the following error: > {code} > 2017-11-17 19:55:41,165 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: > Error replaying edit log at offset 1048576. 
Expected transaction ID was 87 > Recent opcode offsets: 1048576 > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: > got premature end-of-file at txid 86; expected file to go up to 85 > at > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197) > at > org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) > at > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189) > at > org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:205) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393) > 2017-11-17 19:55:41,165 WARN > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Error while reading > edits from disk. Will try again. > org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying > edit log at offset 1048576. 
Expected transaction ID was 87 > Recent opcode offsets: 1048576 > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:218) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393) > Caused by: > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: > got premature end-of-file at txid 86;
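The failure mode above suggests a guard after the clamping step quoted in the description. The sketch below is illustrative only, not the actual HDFS patch: once `endTxId` is clamped to the committed transaction id, an in-progress segment whose `startTxId` lies beyond `endTxId` holds no durable transactions yet and should be skipped rather than opened.

```java
class SegmentGuard {
    // Mirrors the clamp from the issue description, then refuses to open a
    // segment whose start is past the clamped end -- the situation that trips
    // the "got premature end-of-file" error in the stack trace above.
    static boolean shouldOpenSegment(long startTxId, long endTxId,
                                     long committedTxnId,
                                     boolean onlyDurableTxns, boolean inProgressOk) {
        if (onlyDurableTxns && inProgressOk) {
            endTxId = Math.min(endTxId, committedTxnId);
        }
        return startTxId <= endTxId;
    }
}
```

In the reported trace, the segment started at txid 87 while only txid 85 was committed, so the guard would skip it instead of constructing an `EditLogInputStream` with an inverted range.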
[jira] [Updated] (HDFS-12835) RBF: Fix Javadoc parameter errors
[ https://issues.apache.org/jira/browse/HDFS-12835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HDFS-12835: --- Resolution: Fixed Status: Resolved (was: Patch Available) > RBF: Fix Javadoc parameter errors > - > > Key: HDFS-12835 > URL: https://issues.apache.org/jira/browse/HDFS-12835 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.9.0, 3.0.0 >Reporter: Wei Yan >Assignee: Wei Yan >Priority: Minor > Labels: RBF > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1 > > Attachments: HDFS-12835.000.patch, HDFS-12835.001.patch > > > Fix the javadoc errors in Router-based federation. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12835) RBF: Fix Javadoc parameter errors
[ https://issues.apache.org/jira/browse/HDFS-12835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271293#comment-16271293 ] Wei Yan commented on HDFS-12835: Thanks [~elgoiri] for the review. Committed to trunk, branch-3.0, branch-2 and branch-2.9. > RBF: Fix Javadoc parameter errors > - > > Key: HDFS-12835 > URL: https://issues.apache.org/jira/browse/HDFS-12835 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.9.0, 3.0.0 >Reporter: Wei Yan >Assignee: Wei Yan >Priority: Minor > Labels: RBF > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1 > > Attachments: HDFS-12835.000.patch, HDFS-12835.001.patch > > > Fix the javadoc errors in Router-based federation. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12808) Add LOG.isDebugEnabled() guard for LOG.debug("...")
[ https://issues.apache.org/jira/browse/HDFS-12808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271287#comment-16271287 ] Bharat Viswanadham commented on HDFS-12808: --- [~busbey] Is it fine to address only this in this jira? For the rest I have created HDFS-12829. > Add LOG.isDebugEnabled() guard for LOG.debug("...") > --- > > Key: HDFS-12808 > URL: https://issues.apache.org/jira/browse/HDFS-12808 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Mehran Hassani >Assignee: Bharat Viswanadham >Priority: Minor > Attachments: HDFS-12808.00.patch, HDFS-12808.01.patch > > > I am conducting research on log related bugs. I tried to make a tool to fix > repetitive yet simple patterns of bugs that are related to logs. In this > file, there is a debug level logging statement containing multiple string > concatenation without the if statement before them: > hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestCachingStrategy.java, > LOG.debug("got fadvise(offset=" + offset + ", len=" + len +",flags=" + flags > + ")");, 82 > Would you be interested in adding the if, to the logging statement? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
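The guard pattern this issue asks for can be shown with a self-contained sketch. `TinyLog` below is a stand-in for the real logger, instrumented only so the example can demonstrate the point: with the `isDebugEnabled()` check in place, the message string is never concatenated when debug logging is off. (With parameterized SLF4J-style logging, `LOG.debug("... {}", offset)` makes the explicit guard unnecessary.)

```java
// Stand-in logger; messagesBuilt counts how often the (potentially expensive)
// message concatenation actually runs.
class TinyLog {
    private final boolean debugEnabled;
    int messagesBuilt = 0; // instrumentation for the example only
    TinyLog(boolean debugEnabled) { this.debugEnabled = debugEnabled; }
    boolean isDebugEnabled() { return debugEnabled; }
    void debug(String msg) { /* would write to the log */ }
    String fadviseMsg(long offset, int len, int flags) {
        messagesBuilt++;
        return "got fadvise(offset=" + offset + ", len=" + len
            + ", flags=" + flags + ")";
    }
}

class GuardDemo {
    // Guarded form: the concatenation cost is paid only when debug is enabled.
    static void logFadvise(TinyLog log, long offset, int len, int flags) {
        if (log.isDebugEnabled()) {
            log.debug(log.fadviseMsg(offset, len, flags));
        }
    }
}
```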
[jira] [Updated] (HDFS-12835) RBF: Fix Javadoc parameter errors
[ https://issues.apache.org/jira/browse/HDFS-12835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HDFS-12835: --- Target Version/s: 3.1.0, 2.9.1 Hadoop Flags: Reviewed Fix Version/s: 3.0.1 2.9.1 2.10.0 3.1.0 > RBF: Fix Javadoc parameter errors > - > > Key: HDFS-12835 > URL: https://issues.apache.org/jira/browse/HDFS-12835 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.9.0, 3.0.0 >Reporter: Wei Yan >Assignee: Wei Yan >Priority: Minor > Labels: RBF > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1 > > Attachments: HDFS-12835.000.patch, HDFS-12835.001.patch > > > Fix the javadoc errors in Router-based federation. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-11751) DFSZKFailoverController daemon exits with wrong status code
[ https://issues.apache.org/jira/browse/HDFS-11751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271280#comment-16271280 ] Bharat Viswanadham edited comment on HDFS-11751 at 11/29/17 6:19 PM: - [~brahmareddy] could you please help review and committing the patch. was (Author: bharatviswa): [~brahma] could you please help review and committing the patch. > DFSZKFailoverController daemon exits with wrong status code > --- > > Key: HDFS-11751 > URL: https://issues.apache.org/jira/browse/HDFS-11751 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Affects Versions: 3.0.0-alpha2 >Reporter: Doris Gu >Assignee: Bharat Viswanadham > Attachments: HDFS-11751.001.patch, HDFS-11751.02.patch > > > 1.use *hdfs zkfc* to start a zkfc daemon; > 2.zkfc failed to start, but we got the successful code. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12834) DFSZKFailoverController on error exits with 0 error code
[ https://issues.apache.org/jira/browse/HDFS-12834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271282#comment-16271282 ] Bharat Viswanadham commented on HDFS-12834: --- [~brahmareddy] I have addressed review comments and test case failures are not related to this patch. > DFSZKFailoverController on error exits with 0 error code > > > Key: HDFS-12834 > URL: https://issues.apache.org/jira/browse/HDFS-12834 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.7.3, 3.0.0-alpha4 >Reporter: Zbigniew Kostrzewa >Assignee: Bharat Viswanadham > Attachments: HDFS-12834.00.patch, HDFS-12834.01.patch > > > On error {{DFSZKFailoverController}} exits with 0 return code which leads to > problems when integrating it with scripts and monitoring tools, e.g. systemd, > which when configured to restart the service only on failure does not restart > ZKFC because it exited with 0. > For example, in my case, systemd reported zkfc exited with success but in > logs I have found this: > {noformat} > 2017-11-14 05:33:55,075 INFO org.apache.zookeeper.ClientCnxn: Client session > timed out, have not heard from server in 3334ms for sessionid > 0x15fb794bd240001, closing socket connection and attempting reconnect > 2017-11-14 05:33:55,178 INFO org.apache.hadoop.ha.ActiveStandbyElector: > Session disconnected. Entering neutral mode... > 2017-11-14 05:33:55,564 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server 10.9.4.73/10.9.4.73:2182. Will not attempt to > authenticate using SASL (unknown error) > 2017-11-14 05:33:55,566 INFO org.apache.zookeeper.ClientCnxn: Socket > connection established to 10.9.4.73/10.9.4.73:2182, initiating session > 2017-11-14 05:33:55,569 INFO org.apache.zookeeper.ClientCnxn: Session > establishment complete on server 10.9.4.73/10.9.4.73:2182, sessionid = > 0x15fb794bd240001, negotiated timeout = 5000 > 2017-11-14 05:33:55,570 INFO org.apache.hadoop.ha.ActiveStandbyElector: > Session connected. 
> 2017-11-14 05:33:58,230 INFO org.apache.zookeeper.ClientCnxn: Unable to read > additional data from server sessionid 0x15fb794bd240001, likely server has > closed socket, closing socket connection and attempting reconnect > 2017-11-14 05:33:58,335 INFO org.apache.hadoop.ha.ActiveStandbyElector: > Session disconnected. Entering neutral mode... > 2017-11-14 05:33:58,402 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server 10.9.4.138/10.9.4.138:2181. Will not attempt to > authenticate using SASL (unknown error) > 2017-11-14 05:33:58,403 INFO org.apache.zookeeper.ClientCnxn: Socket > connection established to 10.9.4.138/10.9.4.138:2181, initiating session > 2017-11-14 05:33:58,406 INFO org.apache.zookeeper.ClientCnxn: Unable to read > additional data from server sessionid 0x15fb794bd240001, likely server has > closed socket, closing socket connection and attempting reconnect > 2017-11-14 05:33:59,218 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server 10.9.4.228/10.9.4.228:2183. Will not attempt to > authenticate using SASL (unknown error) > 2017-11-14 05:33:59,219 INFO org.apache.zookeeper.ClientCnxn: Socket > connection established to 10.9.4.228/10.9.4.228:2183, initiating session > 2017-11-14 05:33:59,221 INFO org.apache.zookeeper.ClientCnxn: Unable to read > additional data from server sessionid 0x15fb794bd240001, likely server has > closed socket, closing socket connection and attempting reconnect > 2017-11-14 05:34:01,094 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server 10.9.4.73/10.9.4.73:2182. Will not attempt to > authenticate using SASL (unknown error) > 2017-11-14 05:34:01,094 INFO org.apache.zookeeper.ClientCnxn: Client session > timed out, have not heard from server in 1773ms for sessionid > 0x15fb794bd240001, closing socket connection and attempting reconnect > 2017-11-14 05:34:01,196 FATAL org.apache.hadoop.ha.ActiveStandbyElector: > Received stat error from Zookeeper. 
code:CONNECTIONLOSS. Not retrying further > znode monitoring connection errors. > 2017-11-14 05:34:02,153 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x15fb794bd240001 closed > 2017-11-14 05:34:02,154 FATAL org.apache.hadoop.ha.ZKFailoverController: > Fatal error occurred:Received stat error from Zookeeper. code:CONNECTIONLOSS. > Not retrying further znode monitoring connection errors. > 2017-11-14 05:34:02,154 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2017-11-14 05:34:05,208 INFO org.apache.hadoop.ipc.Server: Stopping server on > 8019 > 2017-11-14 05:34:05,487 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server listener on 8019 > 2017-11-14 05:34:05,488 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server Responder >
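The fix direction for this report is to map a fatal error to a nonzero process exit status, so a supervisor configured as systemd's `Restart=on-failure` actually restarts the daemon. A minimal, hedged sketch (the constant names are illustrative, not Hadoop's actual exit codes):

```java
// Translate the daemon's terminal state into a process exit status.
class ExitCodes {
    static final int OK = 0;
    static final int FATAL = 1; // any nonzero value signals failure to systemd

    static int exitCodeFor(boolean fatalErrorOccurred) {
        return fatalErrorOccurred ? FATAL : OK;
    }
}
// Usage sketch: System.exit(ExitCodes.exitCodeFor(sawFatalZkError));
```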
[jira] [Commented] (HDFS-11751) DFSZKFailoverController daemon exits with wrong status code
[ https://issues.apache.org/jira/browse/HDFS-11751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271280#comment-16271280 ] Bharat Viswanadham commented on HDFS-11751: --- [~brahma] could you please help review and committing the patch. > DFSZKFailoverController daemon exits with wrong status code > --- > > Key: HDFS-11751 > URL: https://issues.apache.org/jira/browse/HDFS-11751 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Affects Versions: 3.0.0-alpha2 >Reporter: Doris Gu >Assignee: Bharat Viswanadham > Attachments: HDFS-11751.001.patch, HDFS-11751.02.patch > > > 1.use *hdfs zkfc* to start a zkfc daemon; > 2.zkfc failed to start, but we got the successful code. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12835) RBF: Fix Javadoc parameter errors
[ https://issues.apache.org/jira/browse/HDFS-12835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271273#comment-16271273 ] Hudson commented on HDFS-12835: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13290 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13290/]) HDFS-12835. Fix the javadoc errors in Router-based federation. (weiy: rev 301641811d93ac22dc6fe1a05f18c1f266cc5e54) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/NamenodeStatusReport.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/FederationUtil.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreDriver.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/MembershipNamenodeResolver.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/store/impl/MembershipStoreImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreFileSystemImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreZooKeeperImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/store/RecordStore.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/ActiveNamenodeResolver.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/store/CachedRecordStore.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/ConnectionPool.java * (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/NamenodeHeartbeatService.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreFileBaseImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/ConnectionManager.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/Router.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/MountTableResolver.java > RBF: Fix Javadoc parameter errors > - > > Key: HDFS-12835 > URL: https://issues.apache.org/jira/browse/HDFS-12835 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.9.0, 3.0.0 >Reporter: Wei Yan >Assignee: Wei Yan >Priority: Minor > Labels: RBF > Attachments: HDFS-12835.000.patch, HDFS-12835.001.patch > > > Fix the javadoc errors in Router-based federation. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12763) DataStreamer should heartbeat during flush
[ https://issues.apache.org/jira/browse/HDFS-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271159#comment-16271159 ] Kuhu Shukla commented on HDFS-12763: Would be nice to get some preliminary comments from the community on this. Thanks a lot! CC: [~kihwal]. > DataStreamer should heartbeat during flush > -- > > Key: HDFS-12763 > URL: https://issues.apache.org/jira/browse/HDFS-12763 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.8.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: HDFS-12763.001.patch > > > From HDFS-5032: > bq. Absence of heartbeat during flush will be fixed in a separate jira by > Daryn Sharp > This JIRA tracks the case where absence of heartbeat can cause the pipeline > to fail if operations like flush take some time to complete. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
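The behavior requested here can be sketched independently of DataStreamer's internals: while a long-running flush blocks, a scheduled task keeps sending heartbeat packets so pipeline timeouts are not tripped, and the task is cancelled once the flush completes. This is a hedged illustration of the idea, not the actual patch; `sendHeartbeat` stands in for writing a heartbeat packet to the pipeline.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FlushWithHeartbeat {
    // Runs the blocking flush while heartbeats fire at a fixed interval;
    // returns how many heartbeats were sent (for the example's sake).
    static int flushWithHeartbeats(Runnable flush, Runnable sendHeartbeat,
                                   long heartbeatIntervalMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        final AtomicInteger beats = new AtomicInteger();
        ScheduledFuture<?> hb = timer.scheduleAtFixedRate(
            () -> { sendHeartbeat.run(); beats.incrementAndGet(); },
            heartbeatIntervalMs, heartbeatIntervalMs, TimeUnit.MILLISECONDS);
        try {
            flush.run(); // may take a long time
        } finally {
            hb.cancel(false); // stop heartbeats once the flush returns
            timer.shutdownNow();
        }
        return beats.get();
    }
}
```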
[jira] [Commented] (HDFS-12866) Recursive delete of a large directory or snapshot makes namenode unresponsive
[ https://issues.apache.org/jira/browse/HDFS-12866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270782#comment-16270782 ] Kihwal Lee commented on HDFS-12866: --- bq. 1.1. hold the fsn write lock, disconnect the target to be deleted from its parent dir, release the lock Many have thought about doing this. The snapshot makes it very hard. > Recursive delete of a large directory or snapshot makes namenode unresponsive > - > > Key: HDFS-12866 > URL: https://issues.apache.org/jira/browse/HDFS-12866 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Yongjun Zhang > > Currently file/directory deletion happens in two steps (see > {{FSNamesystem#delete(String src, boolean recursive, boolean logRetryCache)}}): > # Do the following under the fsn write lock and release the lock afterwards: > ** 1.1 recursively traverse the target, collecting the INodes and all blocks to be > deleted > ** 1.2 delete all INodes > # Delete the collected blocks incrementally, chunk by chunk. That is, in > a loop: > ** acquire the fsn write lock, > ** delete a chunk of blocks, > ** release the fsn write lock. > Breaking the deletion into two steps avoids holding the fsn write lock for too > long and thus making the NN unresponsive. However, even with this, when deleting a > large directory, or a snapshot that has a lot of contents, step 1 > itself can take a long time, still holding the fsn write lock for too long > and making the NN unresponsive. > A possible solution would be to add one more sub-step in step 1, and only > hold the fsn write lock in sub-step 1.1: > * 1.1 hold the fsn write lock, disconnect the target to be deleted from its > parent dir, release the lock > * 1.2 recursively traverse the target, collecting the INodes and all blocks to be > deleted > * 1.3 delete all INodes > Then do step 2. > This means any operation on any file/dir needs to check whether one of its > ancestors has been deleted (i.e. is disconnected), similar to what is done in the > FSNamesystem#isFileDeleted method. 
> I'm throwing the thought out here for further discussion. Comments and > inputs are welcome. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
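The chunked block deletion described in step 2 — acquire the lock, delete one chunk, release, repeat — can be sketched as follows. This is a minimal illustration with a plain ReentrantLock standing in for the FSNamesystem write lock; the class, method, and constant names here are hypothetical, not the actual NameNode code.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class ChunkedDelete {
    // Hypothetical chunk size; the real increment is a NameNode configuration value.
    private static final int BLOCK_DELETION_INCREMENT = 1000;

    // Stands in for the FSNamesystem write lock.
    private final ReentrantLock writeLock = new ReentrantLock();

    /**
     * Deletes the collected blocks incrementally so the lock is never held
     * for the whole deletion. Returns the number of chunks processed.
     */
    public int deleteBlocksIncrementally(List<String> collectedBlocks) {
        int chunks = 0;
        int from = 0;
        while (from < collectedBlocks.size()) {
            int to = Math.min(from + BLOCK_DELETION_INCREMENT, collectedBlocks.size());
            writeLock.lock();            // acquire the fsn write lock
            try {
                removeBlocks(collectedBlocks.subList(from, to)); // delete one chunk
            } finally {
                writeLock.unlock();      // release between chunks so other ops can run
            }
            from = to;
            chunks++;
        }
        return chunks;
    }

    private void removeBlocks(List<String> chunk) {
        // Placeholder: the real code would remove these blocks from the blocks map.
    }
}
```

The point of the pattern is that waiting RPC handlers get a chance to acquire the lock between chunks, which is exactly what the proposed sub-step split extends to the traversal phase as well.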
[jira] [Created] (HDFS-12871) Ozone: Service Discovery: Expose list of Datanodes from SCM
Nanda kumar created HDFS-12871: -- Summary: Ozone: Service Discovery: Expose list of Datanodes from SCM Key: HDFS-12871 URL: https://issues.apache.org/jira/browse/HDFS-12871 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Nanda kumar Assignee: Nanda kumar A new RPC call in SCM to return the list of Datanodes in the cluster. This will be used by KSM to construct the service list required for answering the {{getServiceList}} call. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-12870) Ozone: Service Discovery: REST endpoint in KSM for getServiceList
Nanda kumar created HDFS-12870: -- Summary: Ozone: Service Discovery: REST endpoint in KSM for getServiceList Key: HDFS-12870 URL: https://issues.apache.org/jira/browse/HDFS-12870 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Nanda kumar Assignee: Nanda kumar A new REST endpoint to be added to KSM that will return the list of services in the Ozone cluster. This will be used by OzoneClient for establishing the connection. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-12869) Ozone: Service Discovery: RPC endpoint in KSM for getServiceList
Nanda kumar created HDFS-12869: -- Summary: Ozone: Service Discovery: RPC endpoint in KSM for getServiceList Key: HDFS-12869 URL: https://issues.apache.org/jira/browse/HDFS-12869 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Nanda kumar Assignee: Nanda kumar A new RPC call to be added to KSM that will return the list of services in the Ozone cluster. This will be used by OzoneClient for establishing the connection. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-12868) Ozone: Service Discovery API
Nanda kumar created HDFS-12868: -- Summary: Ozone: Service Discovery API Key: HDFS-12868 URL: https://issues.apache.org/jira/browse/HDFS-12868 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Nanda kumar Assignee: Nanda kumar Currently, if a client wants to connect to an Ozone cluster, multiple properties need to be configured on the client. For an RPC-based connection we need {{ozone.ksm.address}} and {{ozone.scm.client.address}}, plus the ports if something other than the default is configured. For a REST-based connection we need {{ozone.rest.servers}}, plus the port if something other than the default is configured. With the introduction of the Service Discovery API, the client should be able to discover all the configuration needed for the connection. Service discovery calls will be handled by KSM; on the client side, we only need to configure {{ozone.ksm.address}}. The client first connects to KSM and gets all the required configuration. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
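The intended discovery flow can be sketched roughly as below: a client that is given only the KSM address asks KSM for the service list and derives every other endpoint from the reply. All class, interface, and method names here are hypothetical placeholders, not the actual Ozone client API.

```java
import java.util.HashMap;
import java.util.Map;

public class DiscoveryClientSketch {
    /** Stands in for the proposed getServiceList reply: service name -> host:port. */
    interface ServiceLister {
        Map<String, String> getServiceList();
    }

    /**
     * Resolves all service endpoints starting from the single configured
     * KSM address, instead of requiring each address in client config.
     */
    public static Map<String, String> discover(String ksmAddress, ServiceLister ksm) {
        Map<String, String> endpoints = new HashMap<>(ksm.getServiceList());
        endpoints.put("ksm", ksmAddress); // the only address the client had to configure
        return endpoints;
    }
}
```

The design choice being proposed is exactly this inversion: the cluster topology lives in one authoritative place (KSM) and clients bootstrap from a single address rather than mirroring several configuration keys.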
[jira] [Commented] (HDFS-12867) Ozone: TestOzoneConfigurationFields fails consistently
[ https://issues.apache.org/jira/browse/HDFS-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270703#comment-16270703 ] genericqa commented on HDFS-12867: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} HDFS-7240 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 37s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 28m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} HDFS-7240 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 52s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}140m 47s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}187m 16s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer | | | hadoop.hdfs.TestDistributedFileSystemWithECFileWithRandomECPolicy | | | hadoop.hdfs.TestDecommissionWithStriped | | | hadoop.fs.TestUnbuffer | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.ozone.web.client.TestKeys | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.cblock.TestBufferManager | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 | | | hadoop.ozone.client.rpc.TestOzoneRpcClient | | | hadoop.hdfs.TestUnsetAndChangeDirectoryEcPolicy | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 | | | hadoop.ozone.web.client.TestKeysRatis | | | hadoop.hdfs.TestParallelUnixDomainRead | | | hadoop.hdfs.server.balancer.TestBalancerRPCDelay | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.cblock.TestCBlockReadWrite | | | hadoop.hdfs.TestDistributedFileSystemWithECFile | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d11161b | | JIRA Issue | HDFS-12867 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12899789/HDFS-12867-HDFS-7240.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml | | uname | Linux 9a3c4c453da0 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build
[jira] [Commented] (HDFS-12859) Admin command resetBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270653#comment-16270653 ] genericqa commented on HDFS-12859: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 34s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 9s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 667 unchanged - 4 fixed = 667 total (was 671) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 28s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}140m 42s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}186m 21s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages | | | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations | | | hadoop.hdfs.server.blockmanagement.TestReplicationPolicy | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.server.blockmanagement.TestSequentialBlockGroupId | | | hadoop.hdfs.server.balancer.TestBalancerRPCDelay | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation | | | hadoop.hdfs.server.datanode.TestBlockScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12859 | | JIRA Patch URL |
[jira] [Updated] (HDFS-12867) Ozone: TestOzoneConfigurationFields fails consistently
[ https://issues.apache.org/jira/browse/HDFS-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiqun Lin updated HDFS-12867: - Attachment: HDFS-12867-HDFS-7240.002.patch > Ozone: TestOzoneConfigurationFields fails consistently > -- > > Key: HDFS-12867 > URL: https://issues.apache.org/jira/browse/HDFS-12867 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone, test >Affects Versions: HDFS-7240 >Reporter: Yiqun Lin >Assignee: Yiqun Lin > Attachments: HDFS-12867-HDFS-7240.001.patch, > HDFS-12867-HDFS-7240.002.patch > > > The unit test TestOzoneConfigurationFields fails consistently because of 2 > config entries missing from the ozone-default.xml file. The stack trace: > {noformat} > java.lang.AssertionError: class org.apache.hadoop.ozone.OzoneConfigKeys class > org.apache.hadoop.scm.ScmConfigKeys class > org.apache.hadoop.ozone.ksm.KSMConfigKeys class > org.apache.hadoop.cblock.CBlockConfigKeys has 2 variables missing in > ozone-default.xml Entries: ozone.rest.client.port ozone.rest.servers > expected:<0> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.conf.TestConfigurationFieldsBase.testCompareConfigurationClassAgainstXml(TestConfigurationFieldsBase.java:493) > {noformat} > The configs {{ozone.rest.client.port}} and {{ozone.rest.servers}} were introduced > in HDFS-12549 but were not documented. This leads to the failure. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12867) Ozone: TestOzoneConfigurationFields fails consistently
[ https://issues.apache.org/jira/browse/HDFS-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270503#comment-16270503 ] Yiqun Lin commented on HDFS-12867: -- Thanks for the quick reviews, [~anu] and [~nandakumar131]. Attaching the updated patch to address the comments. > Ozone: TestOzoneConfigurationFields fails consistently > -- > > Key: HDFS-12867 > URL: https://issues.apache.org/jira/browse/HDFS-12867 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone, test >Affects Versions: HDFS-7240 >Reporter: Yiqun Lin >Assignee: Yiqun Lin > Attachments: HDFS-12867-HDFS-7240.001.patch > > > The unit test TestOzoneConfigurationFields fails consistently because of 2 > config entries missing from the ozone-default.xml file. The stack trace: > {noformat} > java.lang.AssertionError: class org.apache.hadoop.ozone.OzoneConfigKeys class > org.apache.hadoop.scm.ScmConfigKeys class > org.apache.hadoop.ozone.ksm.KSMConfigKeys class > org.apache.hadoop.cblock.CBlockConfigKeys has 2 variables missing in > ozone-default.xml Entries: ozone.rest.client.port ozone.rest.servers > expected:<0> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.conf.TestConfigurationFieldsBase.testCompareConfigurationClassAgainstXml(TestConfigurationFieldsBase.java:493) > {noformat} > The configs {{ozone.rest.client.port}} and {{ozone.rest.servers}} were introduced > in HDFS-12549 but were not documented. This leads to the failure. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
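The check that testCompareConfigurationClassAgainstXml performs boils down to a set difference between the key constants declared in the config classes and the keys present in the defaults XML file. A simplified standalone version of that comparison — not the actual TestConfigurationFieldsBase code — might look like:

```java
import java.util.HashSet;
import java.util.Set;

public class ConfigFieldsCheck {
    /**
     * Returns the config keys declared in code but absent from the defaults
     * file; an empty result means the defaults file is complete.
     */
    public static Set<String> missingFromDefaults(Set<String> declaredKeys,
                                                  Set<String> defaultsXmlKeys) {
        Set<String> missing = new HashSet<>(declaredKeys);
        missing.removeAll(defaultsXmlKeys); // keep only undocumented keys
        return missing;
    }
}
```

With {{ozone.rest.client.port}} and {{ozone.rest.servers}} declared in code but absent from ozone-default.xml, a check of this shape reports exactly the two missing entries seen in the assertion message above, so the fix is simply to document both keys in the defaults file.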
[jira] [Commented] (HDFS-12867) Ozone: TestOzoneConfigurationFields fails consistently
[ https://issues.apache.org/jira/browse/HDFS-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270475#comment-16270475 ] genericqa commented on HDFS-12867: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} HDFS-7240 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 0s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 29m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} HDFS-7240 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 40s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}132m 56s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}181m 16s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.fs.TestUnbuffer | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.ozone.web.client.TestKeys | | | hadoop.ozone.scm.TestAllocateContainer | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.cblock.TestBufferManager | | | hadoop.ozone.client.rpc.TestOzoneRpcClient | | | hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer | | | hadoop.ozone.web.client.TestKeysRatis | | | hadoop.hdfs.server.balancer.TestBalancerRPCDelay | | | hadoop.cblock.TestCBlockReadWrite | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d11161b | | JIRA Issue | HDFS-12867 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12899759/HDFS-12867-HDFS-7240.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml | | uname | Linux 8dd046cf377d 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-7240 / ac9cc8a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/22217/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results |
[jira] [Updated] (HDFS-12859) Admin command resetBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianfei Jiang updated HDFS-12859: - Release Note: Fix due to the unit test error. (was: Add command "hdfs dfsadmin -resetBalancerBandwidth" ) Status: Patch Available (was: Open) > Admin command resetBalancerBandwidth > > > Key: HDFS-12859 > URL: https://issues.apache.org/jira/browse/HDFS-12859 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover >Reporter: Jianfei Jiang > Fix For: 3.1.0 > > Attachments: > 0003-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, > 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, HDFS-12859.patch > > > We can already set the balancer bandwidth dynamically using the > setBalancerBandwidth command. The value set is not persistent and is not stored in > the configuration file, and different datanodes could have their own default or > former settings in their configurations. > We planned to develop a scheduled balancer task that runs at midnight > every day, setting a larger bandwidth for it and resetting the value after it > finishes. However, we found it difficult to restore the different settings for > different datanodes, as the setBalancerBandwidth command can only set the same > value on all datanodes. If we want to use a unique setting for every datanode, > we have to restart the datanodes. > So it would be useful to have a command to synchronize the setting with the > configuration file. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12859) Admin command resetBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianfei Jiang updated HDFS-12859: - Attachment: 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch > Admin command resetBalancerBandwidth > > > Key: HDFS-12859 > URL: https://issues.apache.org/jira/browse/HDFS-12859 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover >Reporter: Jianfei Jiang > Fix For: 3.1.0 > > Attachments: > 0003-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, > 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, HDFS-12859.patch > > > We can already set the balancer bandwidth dynamically using the > setBalancerBandwidth command. The value set is not persistent and is not stored in > the configuration file, and different datanodes could have their own default or > former settings in their configurations. > We planned to develop a scheduled balancer task that runs at midnight > every day, setting a larger bandwidth for it and resetting the value after it > finishes. However, we found it difficult to restore the different settings for > different datanodes, as the setBalancerBandwidth command can only set the same > value on all datanodes. If we want to use a unique setting for every datanode, > we have to restart the datanodes. > So it would be useful to have a command to synchronize the setting with the > configuration file. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12859) Admin command resetBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianfei Jiang updated HDFS-12859: - Attachment: 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch > Admin command resetBalancerBandwidth > > > Key: HDFS-12859 > URL: https://issues.apache.org/jira/browse/HDFS-12859 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover >Reporter: Jianfei Jiang > Fix For: 3.1.0 > > Attachments: > 0003-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, HDFS-12859.patch > > > We can already set the balancer bandwidth dynamically using the > setBalancerBandwidth command. The value set is not persistent and is not stored in > the configuration file, and different datanodes could have their own default or > former settings in their configurations. > We planned to develop a scheduled balancer task that runs at midnight > every day, setting a larger bandwidth for it and resetting the value after it > finishes. However, we found it difficult to restore the different settings for > different datanodes, as the setBalancerBandwidth command can only set the same > value on all datanodes. If we want to use a unique setting for every datanode, > we have to restart the datanodes. > So it would be useful to have a command to synchronize the setting with the > configuration file. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12859) Admin command resetBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianfei Jiang updated HDFS-12859: - Attachment: (was: 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch) > Admin command resetBalancerBandwidth > > > Key: HDFS-12859 > URL: https://issues.apache.org/jira/browse/HDFS-12859 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover >Reporter: Jianfei Jiang > Fix For: 3.1.0 > > Attachments: > 0003-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, HDFS-12859.patch > > > We can already set the balancer bandwidth dynamically using the > setBalancerBandwidth command. The value set is not persistent and is not stored in > the configuration file, and different datanodes could have their own default or > former settings in their configurations. > We planned to develop a scheduled balancer task that runs at midnight > every day, setting a larger bandwidth for it and resetting the value after it > finishes. However, we found it difficult to restore the different settings for > different datanodes, as the setBalancerBandwidth command can only set the same > value on all datanodes. If we want to use a unique setting for every datanode, > we have to restart the datanodes. > So it would be useful to have a command to synchronize the setting with the > configuration file. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12859) Admin command resetBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianfei Jiang updated HDFS-12859: - Attachment: (was: 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch) > Admin command resetBalancerBandwidth > > > Key: HDFS-12859 > URL: https://issues.apache.org/jira/browse/HDFS-12859 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover >Reporter: Jianfei Jiang > Fix For: 3.1.0 > > Attachments: > 0003-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, HDFS-12859.patch > > > We can already set the balancer bandwidth dynamically using the > setBalancerBandwidth command. The value set is not persistent and is not stored in > the configuration file, and different datanodes could have their own default or > former settings in their configurations. > We planned to develop a scheduled balancer task that runs at midnight > every day, setting a larger bandwidth for it and resetting the value after it > finishes. However, we found it difficult to restore the different settings for > different datanodes, as the setBalancerBandwidth command can only set the same > value on all datanodes. If we want to use a unique setting for every datanode, > we have to restart the datanodes. > So it would be useful to have a command to synchronize the setting with the > configuration file. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12859) Admin command resetBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jianfei Jiang updated HDFS-12859:
---------------------------------
    Status: Open  (was: Patch Available)

> Admin command resetBalancerBandwidth
> ------------------------------------
>
>                 Key: HDFS-12859
>                 URL: https://issues.apache.org/jira/browse/HDFS-12859
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: balancer & mover
>            Reporter: Jianfei Jiang
>             Fix For: 3.1.0
>
>         Attachments: 0003-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, HDFS-12859.patch
>
> We can already set the balancer bandwidth dynamically using the setBalancerBandwidth command. The value set this way is not persistent and is not stored in the configuration file, so different datanodes may each keep their own default or previously configured setting.
> We planned to develop a scheduled balancer task that runs at midnight every day: set a larger bandwidth for the run, then reset the value after it finishes. However, we found it difficult to restore the different settings on different datanodes, because the setBalancerBandwidth command can only push the same value to all datanodes. To use a unique setting for each datanode, we would have to restart the datanodes.
> So it would be useful to have a command that resynchronizes the runtime setting with the configuration file.
[jira] [Updated] (HDFS-12859) Admin command resetBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jianfei Jiang updated HDFS-12859:
---------------------------------
    Attachment: 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch

> Admin command resetBalancerBandwidth
> ------------------------------------
>
>                 Key: HDFS-12859
>                 URL: https://issues.apache.org/jira/browse/HDFS-12859
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: balancer & mover
>            Reporter: Jianfei Jiang
>             Fix For: 3.1.0
>
>         Attachments: 0003-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, 0004-HDFS-12859-Admin-command-resetBalancerBandwidth.patch, HDFS-12859.patch
>
> We can already set the balancer bandwidth dynamically using the setBalancerBandwidth command. The value set this way is not persistent and is not stored in the configuration file, so different datanodes may each keep their own default or previously configured setting.
> We planned to develop a scheduled balancer task that runs at midnight every day: set a larger bandwidth for the run, then reset the value after it finishes. However, we found it difficult to restore the different settings on different datanodes, because the setBalancerBandwidth command can only push the same value to all datanodes. To use a unique setting for each datanode, we would have to restart the datanodes.
> So it would be useful to have a command that resynchronizes the runtime setting with the configuration file.
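The nightly workflow described in the issue can be sketched as follows. This is a minimal illustration, not part of the patch: the commands are only echoed, {{hdfs dfsadmin -setBalancerBandwidth}} and {{hdfs balancer}} are existing Hadoop commands, and {{-resetBalancerBandwidth}} is the subcommand this issue proposes, which does not exist in any released Hadoop version.

```shell
#!/bin/sh
# Sketch of the scheduled midnight balancing flow from HDFS-12859.
# Commands are echoed rather than executed; a real cron job would run them
# against a live cluster. -resetBalancerBandwidth is hypothetical (proposed here).

RAISED_BW=$((100 * 1024 * 1024))   # temporarily allow 100 MB/s for the midnight run

# Existing command: pushes the SAME value to all datanodes, not persisted anywhere.
echo "hdfs dfsadmin -setBalancerBandwidth ${RAISED_BW}"

# Run the scheduled balancing pass.
echo "hdfs balancer -threshold 10"

# Proposed command: each datanode would re-read its own
# dfs.datanode.balance.bandwidthPerSec from hdfs-site.xml, restoring
# per-datanode values without restarting the datanodes.
echo "hdfs dfsadmin -resetBalancerBandwidth"
```

The key point the sketch highlights is the asymmetry: raising the bandwidth is a single broadcast value, but restoring it cleanly needs each datanode to fall back to its own configured value, which is exactly what the proposed command would do.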
[jira] [Commented] (HDFS-12867) Ozone: TestOzoneConfigurationFields fails consistently
[ https://issues.apache.org/jira/browse/HDFS-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270387#comment-16270387 ]

Nanda kumar commented on HDFS-12867:
------------------------------------
Thanks [~linyiqun] for reporting and working on this. One minor comment: can we elaborate the descriptions a bit?

{{ozone.rest.servers}} -> {{The REST server hostnames to connect to; a comma-separated list of hosts (typically datanodes) where the Ozone REST handlers are running.}}

{{ozone.rest.client.port}} -> {{Port used by clients to connect to the Ozone REST server. When a datanode is configured to run the Ozone REST handler, this port typically points to the datanode info port.}}

> Ozone: TestOzoneConfigurationFields fails consistently
> ------------------------------------------------------
>
>                 Key: HDFS-12867
>                 URL: https://issues.apache.org/jira/browse/HDFS-12867
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone, test
>    Affects Versions: HDFS-7240
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: HDFS-12867-HDFS-7240.001.patch
>
> The unit test TestOzoneConfigurationFields fails consistently because two config entries are missing from the ozone-default file. The stack trace:
> {noformat}
> java.lang.AssertionError: class org.apache.hadoop.ozone.OzoneConfigKeys class org.apache.hadoop.scm.ScmConfigKeys class org.apache.hadoop.ozone.ksm.KSMConfigKeys class org.apache.hadoop.cblock.CBlockConfigKeys has 2 variables missing in ozone-default.xml Entries: ozone.rest.client.port ozone.rest.servers expected:<0> but was:<2>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.apache.hadoop.conf.TestConfigurationFieldsBase.testCompareConfigurationClassAgainstXml(TestConfigurationFieldsBase.java:493)
> {noformat}
> The configs {{ozone.rest.client.port}} and {{ozone.rest.servers}} were introduced in HDFS-12549 but were never documented in ozone-default.xml, which leads to this failure.
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
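A fix along the lines suggested in the comment above would add the two missing entries to ozone-default.xml. The description text below is the wording proposed in the comment; the empty default values are an assumption, since the actual defaults are not given in this thread:

```xml
<!-- Sketch of the two entries TestOzoneConfigurationFields expects in
     ozone-default.xml; values left empty as a placeholder assumption. -->
<property>
  <name>ozone.rest.servers</name>
  <value></value>
  <description>The REST server hostnames to connect to; a comma-separated
    list of hosts (typically datanodes) where the Ozone REST handlers are
    running.</description>
</property>
<property>
  <name>ozone.rest.client.port</name>
  <value></value>
  <description>Port used by clients to connect to the Ozone REST server.
    When a datanode is configured to run the Ozone REST handler, this port
    typically points to the datanode info port.</description>
</property>
```

Once both keys appear in ozone-default.xml, testCompareConfigurationClassAgainstXml should find zero missing variables and the test should pass.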