[jira] [Updated] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning
[ https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10381: -- Attachment: HDFS-10381.001.patch Please review patch 001: * Change 2 logs from info to warn in {{DataStreamer.nextBlockOutputStream}} * Change 1 log from info to warn in {{StripedDataStreamer.nextBlockOutputStream}} Couldn't find any suitable unit test to change or run. > DataStreamer DataNode exclusion log message should be warning > - > > Key: HDFS-10381 > URL: https://issues.apache.org/jira/browse/HDFS-10381 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Attachments: HDFS-10381.001.patch > > > When adding a DN to {{excludedNodes}}, it should log a warning message > instead of info. > {code} > success = createBlockOutputStream(nodes, storageTypes, 0L, false); > if (!success) { > LOG.info("Abandoning " + block); > dfsClient.namenode.abandonBlock(block, stat.getFileId(), src, > dfsClient.clientName); > block = null; > final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()]; > LOG.info("Excluding datanode " + badNode); > excludedNodes.put(badNode, badNode); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
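For reference, the excerpt below shows the shape of the change patch 001 describes, applied to the snippet quoted in the issue; it is a sketch of the intent (the two info logs in {{DataStreamer.nextBlockOutputStream}} promoted to warn), not the patch itself:
{code}
success = createBlockOutputStream(nodes, storageTypes, 0L, false);
if (!success) {
  // Promoted from LOG.info: abandoning a block is an abnormal event.
  LOG.warn("Abandoning " + block);
  dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
      dfsClient.clientName);
  block = null;
  final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
  // Promoted from LOG.info: excluding a datanode deserves operator attention.
  LOG.warn("Excluding datanode " + badNode);
  excludedNodes.put(badNode, badNode);
}
{code}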
[jira] [Updated] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning
[ https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10381: -- Status: Patch Available (was: In Progress) > DataStreamer DataNode exclusion log message should be warning > - > > Key: HDFS-10381 > URL: https://issues.apache.org/jira/browse/HDFS-10381 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Attachments: HDFS-10381.001.patch > > > When adding a DN to {{excludedNodes}}, it should log a warning message > instead of info. > {code} > success = createBlockOutputStream(nodes, storageTypes, 0L, false); > if (!success) { > LOG.info("Abandoning " + block); > dfsClient.namenode.abandonBlock(block, stat.getFileId(), src, > dfsClient.clientName); > block = null; > final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()]; > LOG.info("Excluding datanode " + badNode); > excludedNodes.put(badNode, badNode); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9559) Add haadmin command to get HA state of all the namenodes
[ https://issues.apache.org/jira/browse/HDFS-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277653#comment-15277653 ] Surendra Singh Lilhore commented on HDFS-9559: -- [~eddyxu] Can you please review? > Add haadmin command to get HA state of all the namenodes > > > Key: HDFS-9559 > URL: https://issues.apache.org/jira/browse/HDFS-9559 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > Attachments: HDFS-9559.01.patch > > > Currently we have one command to get the state of a single namenode: > {code} > ./hdfs haadmin -getServiceState <serviceId> > {code} > It would be good to have a command that reports the state of all the namenodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
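As an illustration of the proposed command's shape (the flag name and the output format below are hypothetical, not a committed interface):
{noformat}
$ ./hdfs haadmin -getAllServiceState
nn1.example.com:8020    active
nn2.example.com:8020    standby
{noformat}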
[jira] [Commented] (HDFS-10303) DataStreamer#ResponseProcessor calculate packet acknowledge duration wrongly.
[ https://issues.apache.org/jira/browse/HDFS-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277650#comment-15277650 ] Surendra Singh Lilhore commented on HDFS-10303: --- The checkstyle warning and the failed test case are unrelated. Please review. bq. org.apache.hadoop.hdfs.TestHFlush.testHFlushInterrupted This test case is fixed in HDFS-2043. > DataStreamer#ResponseProcessor calculate packet acknowledge duration wrongly. > - > > Key: HDFS-10303 > URL: https://issues.apache.org/jira/browse/HDFS-10303 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.2 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > Attachments: HDFS-10303-001.patch > > > Packet acknowledge duration should be calculated based on the packet send > time. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
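A minimal sketch of the intended calculation (the packet accessor below is hypothetical; DataStreamer's real bookkeeping differs):
{code}
// Measure the ack duration from the moment the packet was sent, not from
// when processing of the previous ack finished.
long ackDurationMs = Time.monotonicNow() - packet.getSendTimeMs(); // getSendTimeMs() is hypothetical
{code}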
[jira] [Commented] (HDFS-10346) Implement asynchronous setPermission/setOwner for DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277640#comment-15277640 ] Xiaobing Zhou commented on HDFS-10346: -- Thank you for the review. I posted patch v004. It: 1. called fixRelativePart and added doc for the return value; 2. moved test preparation out of the async call test; 3. added two tests, testConservativeConcurrentAsyncAPI and testAggressiveConcurrentAsyncAPI, which limit the max number of async calls to 100 and to an effectively unlimited value (large enough so as not to be hit), respectively; 4. removed testPermissionChecking, which did much the same thing as testAggressiveConcurrentAsyncAPI in patch v003; 5. added a rename test to both testConservativeConcurrentAsyncAPI and testAggressiveConcurrentAsyncAPI, but assumed a specific order/dependency of rename, setPermission and setOwner; otherwise, the test would need to deal with a swamp of non-deterministic results due to the dependency between async calls. > Implement asynchronous setPermission/setOwner for DistributedFileSystem > --- > > Key: HDFS-10346 > URL: https://issues.apache.org/jira/browse/HDFS-10346 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, hdfs-client >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10346-HDFS-9924.000.patch, > HDFS-10346-HDFS-9924.001.patch, HDFS-10346-HDFS-9924.003.patch, > HDFS-10346-HDFS-9924.004.patch > > > This is proposed to implement an asynchronous setPermission and setOwner. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
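For readers following along, a sketch of how the API under review is exercised; the class and method names follow the patch series ({{AsyncDistributedFileSystem}} and friends), but this is illustrative, not a released interface:
{code}
DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
AsyncDistributedFileSystem adfs = dfs.getAsyncDistributedFileSystem();
// Each call returns immediately with a Future; results are collected later.
Future<Void> permFuture = adfs.setPermission(path, new FsPermission((short) 0755));
Future<Void> ownerFuture = adfs.setOwner(path, "user1", "group1");
permFuture.get();   // surfaces any remote exception from the NameNode
ownerFuture.get();
{code}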
[jira] [Updated] (HDFS-10346) Implement asynchronous setPermission/setOwner for DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-10346: - Attachment: HDFS-10346-HDFS-9924.004.patch > Implement asynchronous setPermission/setOwner for DistributedFileSystem > --- > > Key: HDFS-10346 > URL: https://issues.apache.org/jira/browse/HDFS-10346 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, hdfs-client >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10346-HDFS-9924.000.patch, > HDFS-10346-HDFS-9924.001.patch, HDFS-10346-HDFS-9924.003.patch, > HDFS-10346-HDFS-9924.004.patch > > > This is proposed to implement an asynchronous setPermission and setOwner. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning
[ https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277626#comment-15277626 ] John Zhuge commented on HDFS-10381: --- Thanks [~liuml07]. Fixed. > DataStreamer DataNode exclusion log message should be warning > - > > Key: HDFS-10381 > URL: https://issues.apache.org/jira/browse/HDFS-10381 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > > When adding a DN to {{excludedNodes}}, it should log a warning message > instead of info. > {code} > success = createBlockOutputStream(nodes, storageTypes, 0L, false); > if (!success) { > LOG.info("Abandoning " + block); > dfsClient.namenode.abandonBlock(block, stat.getFileId(), src, > dfsClient.clientName); > block = null; > final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()]; > LOG.info("Excluding datanode " + badNode); > excludedNodes.put(badNode, badNode); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning
[ https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10381: -- Component/s: (was: datanode) hdfs-client > DataStreamer DataNode exclusion log message should be warning > - > > Key: HDFS-10381 > URL: https://issues.apache.org/jira/browse/HDFS-10381 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > > When adding a DN to {{excludedNodes}}, it should log a warning message > instead of info. > {code} > success = createBlockOutputStream(nodes, storageTypes, 0L, false); > if (!success) { > LOG.info("Abandoning " + block); > dfsClient.namenode.abandonBlock(block, stat.getFileId(), src, > dfsClient.clientName); > block = null; > final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()]; > LOG.info("Excluding datanode " + badNode); > excludedNodes.put(badNode, badNode); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10381) DataStreamer DataNode exclusion log message should be warning
[ https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-10381: -- Summary: DataStreamer DataNode exclusion log message should be warning (was: DataNode exclusion log message should be a warning) > DataStreamer DataNode exclusion log message should be warning > - > > Key: HDFS-10381 > URL: https://issues.apache.org/jira/browse/HDFS-10381 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > > When adding a DN to {{excludedNodes}}, it should log a warning message > instead of info. > {code} > success = createBlockOutputStream(nodes, storageTypes, 0L, false); > if (!success) { > LOG.info("Abandoning " + block); > dfsClient.namenode.abandonBlock(block, stat.getFileId(), src, > dfsClient.clientName); > block = null; > final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()]; > LOG.info("Excluding datanode " + badNode); > excludedNodes.put(badNode, badNode); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable
[ https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277597#comment-15277597 ] John Zhuge commented on HDFS-10287: --- Thanks [~boky01]. Leave the unit test alone to minimize noise for this patch. We can always file a separate jira to clean it up. The unit test failures seem unrelated. +1 LGTM. > MiniDFSCluster should implement AutoCloseable > - > > Key: HDFS-10287 > URL: https://issues.apache.org/jira/browse/HDFS-10287 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.7.0 >Reporter: John Zhuge >Assignee: Andras Bokor >Priority: Trivial > Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch > > > {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support > [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html]. > It will make test code a little cleaner and more reliable. > Since {{AutoCloseable}} is only in Java 1.7 or later, this can not be > backported to Hadoop version prior to 2.7. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
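For context, a minimal sketch of the try-with-resources usage this issue enables (the builder options are illustrative):
{code}
Configuration conf = new HdfsConfiguration();
// With MiniDFSCluster implementing AutoCloseable, the cluster is shut down
// automatically at the end of the try block, even if the test body throws.
try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build()) {
  cluster.waitActive();
  FileSystem fs = cluster.getFileSystem();
  fs.mkdirs(new Path("/test"));
}
{code}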
[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker
[ https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-8449: Status: Patch Available (was: In Progress) > Add tasks count metrics to datanode for ECWorker > > > Key: HDFS-8449 > URL: https://issues.apache.org/jira/browse/HDFS-8449 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Li Bo > Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, > HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch, > HDFS-8449-005.patch, HDFS-8449-006.patch, HDFS-8449-007.patch, > HDFS-8449-008.patch, HDFS-8449-009.patch, HDFS-8449-010.patch, > HDFS-8449-v10.patch, HDFS-8449-v11.patch > > > This sub-task tries to record the EC recovery tasks that a datanode has done, > including total tasks, failed tasks and successful tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker
[ https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-8449: Status: In Progress (was: Patch Available) To trigger the test run. > Add tasks count metrics to datanode for ECWorker > > > Key: HDFS-8449 > URL: https://issues.apache.org/jira/browse/HDFS-8449 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Li Bo > Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, > HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch, > HDFS-8449-005.patch, HDFS-8449-006.patch, HDFS-8449-007.patch, > HDFS-8449-008.patch, HDFS-8449-009.patch, HDFS-8449-010.patch, > HDFS-8449-v10.patch, HDFS-8449-v11.patch > > > This sub-task tries to record the EC recovery tasks that a datanode has done, > including total tasks, failed tasks and successful tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10383) Safely close resources in DFSTestUtil
Mingliang Liu created HDFS-10383: Summary: Safely close resources in DFSTestUtil Key: HDFS-10383 URL: https://issues.apache.org/jira/browse/HDFS-10383 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Mingliang Liu Assignee: Mingliang Liu Priority: Minor There are a few methods in {{DFSTestUtil}} that do not close the resource safely, or elegantly. We can use the try-with-resources statement to address this problem. Specifically, as {{DFSTestUtil}} is widely used in tests, we need to preserve any exceptions thrown during the processing of the resource while still guaranteeing it is closed in the end. Take, for example, the current implementation of {{DFSTestUtil#createFile()}}: it closes the FSDataOutputStream in the {{finally}} block, and if the internal {{DFSOutputStream#close()}} throws any exception, which it often does, the exception thrown during the processing will be lost. See this [test failure|https://builds.apache.org/job/PreCommit-HADOOP-Build/9320/testReport/org.apache.hadoop.hdfs/TestAsyncDFSRename/testAggressiveConcurrentAsyncRenameWithOverwrite/], and we have to guess at the root cause. Using try-with-resources, we can close the resources safely, and the exceptions thrown both in processing and in closing will be available (the closing exception will be suppressed). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
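A minimal sketch of the proposed pattern, simplified from what {{DFSTestUtil#createFile()}} would look like (the surrounding {{fs}}, {{path}} and {{data}} are assumed test context):
{code}
// If both the write and close() throw, the write's exception propagates and
// the close() exception is attached to it as a suppressed exception, so the
// root cause is no longer lost.
try (FSDataOutputStream out = fs.create(path)) {
  out.write(data);
}
{code}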
[jira] [Updated] (HDFS-10382) In WebHDFS numeric usernames do not work with DataNode
[ https://issues.apache.org/jira/browse/HDFS-10382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramtin updated HDFS-10382: -- Status: Patch Available (was: Open) > In WebHDFS numeric usernames do not work with DataNode > -- > > Key: HDFS-10382 > URL: https://issues.apache.org/jira/browse/HDFS-10382 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: ramtin >Assignee: ramtin > > Operations like {code:java}curl -i > -L "http://<host>:<port>/webhdfs/v1/<path>?user.name=0123&op=OPEN"{code} that are > directed to the DataNode fail because the DataNode does not read the suggested domain pattern > from the configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10382) In WebHDFS numeric usernames do not work with DataNode
[ https://issues.apache.org/jira/browse/HDFS-10382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277390#comment-15277390 ] ASF GitHub Bot commented on HDFS-10382: --- GitHub user ramtinb opened a pull request: https://github.com/apache/hadoop/pull/94 HDFS-10382 In WebHDFS numeric usernames do not work with DataNode In WebHDFS, a cat operation involves two sequential HTTP requests. The first HTTP request is handled by the NN and the second one by a DN. Unlike the NN, the DN is not using the suggested domain pattern from the configuration! You can merge this pull request into a Git repository by running: $ git pull https://github.com/ramtinb/hadoop feature/HDFS-10382 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hadoop/pull/94.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #94 commit 6cbb8f60702b8c3211e52e71681fc8d981fe0525 Author: Ramtin Boustani Date: 2016-05-10T00:05:05Z HDFS-10382 In WebHDFS numeric usernames do not work with DataNode > In WebHDFS numeric usernames do not work with DataNode > -- > > Key: HDFS-10382 > URL: https://issues.apache.org/jira/browse/HDFS-10382 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: ramtin >Assignee: ramtin > > Operations like {code:java}curl -i > -L "http://<host>:<port>/webhdfs/v1/<path>?user.name=0123&op=OPEN"{code} that are > directed to the DataNode fail because the DataNode does not read the suggested domain pattern > from the configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
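For context, the NameNode side of this was made configurable in HDFS-4983 via a username pattern; below is a sketch of relaxing that pattern to admit numeric usernames (the property name follows HDFS-4983, and the pattern value is illustrative). The fix proposed here is essentially for the DataNode to honor the same setting:
{code}
Configuration conf = new Configuration();
// The default pattern requires a leading letter or underscore; allowing a
// leading digit admits requests such as user.name=0123.
conf.set("dfs.webhdfs.user.provider.user.pattern",
    "^[A-Za-z0-9_][A-Za-z0-9._-]*[$]?$");
{code}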
[jira] [Commented] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing
[ https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277292#comment-15277292 ] Hadoop QA commented on HDFS-10360:
--
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 11s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 52s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 53s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 55s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 55s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 26s {color} | {color:red} root: patch generated 6 new + 142 unchanged - 1 fixed = 148 total (was 143) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 17s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 30s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 12s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 60m 20s {color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 33s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 34s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 40s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}
[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
[ https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277267#comment-15277267 ] Hudson commented on HDFS-10372: --- FAILURE: Integrated in Hadoop-trunk-Commit #9737 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9737/]) HDFS-10372. Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume. (kihwal: rev b9e5a32fa14b727b44118ec7f43fb95de05a7c2c) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java > Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume > --- > > Key: HDFS-10372 > URL: https://issues.apache.org/jira/browse/HDFS-10372 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.7.3 > > Attachments: HDFS-10372.patch > > > TestFsDatasetImpl#testCleanShutdownOfVolume fails very often. > We added more debug information in HDFS-10260 to find out why this test is > failing. > Now I think I know the root cause of failure. > I thought that {{LocatedBlock#getLocations()}} returns an array of > DatanodeInfo but now I realized that it returns an array of > DatandeStorageInfo (which is subclass of DatanodeInfo). > In the test I intended to check whether the exception contains the xfer > address of the DatanodeInfo. Since {{DatanodeInfo#toString()}} method returns > the xfer address, I checked whether exception contains > {{DatanodeInfo#toString}} or not. > But since {{LocatedBlock#getLocations()}} returned an array of > DatanodeStorageInfo, it has storage info in the toString() implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
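A sketch of the corrected check (the surrounding test context, {{lb}} and {{ioe}}, is assumed): compare the exception text against the xfer address itself instead of {{DatanodeInfo#toString()}}:
{code}
// getLocations() actually returns DatanodeInfoWithStorage objects whose
// toString() appends storage info, so match on the xfer address directly.
DatanodeInfo info = lb.getLocations()[0];
assertTrue(ioe.getMessage().contains(info.getXferAddr()));
{code}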
[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable
[ https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277254#comment-15277254 ] Hadoop QA commented on HDFS-10287:
--
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: patch generated 0 new + 208 unchanged - 3 fixed = 208 total (was 211) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 6s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 46s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 145m 32s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.TestFileCreationDelete |
| | hadoop.hdfs.TestCrcCorruption |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestFileCreationDelete |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12803063/HDFS-10287.02.patch |
| JIRA Issue | HDFS-10287 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 8c6cd1036e86 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT W
[jira] [Comment Edited] (HDFS-10378) setOwner throws AccessControlException with wrong message
[ https://issues.apache.org/jira/browse/HDFS-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276994#comment-15276994 ] Yongjun Zhang edited comment on HDFS-10378 at 5/9/16 10:41 PM: --- HI [~jzhuge], Thanks for reporting the issue and the patch. Couple of comments: 1. would you please provide single patch file that contains both the fix and unit test, instead of multiple files? 2. {{setOwner}} expects user to be superuser or belonging to superuser group. The call to {{fsd.checkOwner(pc, iip);}} appears to be no-op for this kind of user. Would you please look into whether we really need the call {{fsd.checkOwner(pc, iip);}} at all in {{setOwner}} method? Is being superuser both necessary and sufficient condition for this method? Thanks. was (Author: yzhangal): HI [~jzhuge], Thanks for reporting the issue and the patch. Couple of comments: 1. would you please provide single patch file that contains both the fix and unit test, instead of multiple files? 2. {{setOwner}} expects user to be superuser or belonging to superuser group. The call to {{fsd.checkOwner(pc, iip);}} appears to be non-op for this kind of user. Would you please look into whether we really need the call {{fsd.checkOwner(pc, iip);}} at all in {{setOwner}} method? Is being superuser both necessary and sufficient condition for this method? Thanks. > setOwner throws AccessControlException with wrong message > - > > Key: HDFS-10378 > URL: https://issues.apache.org/jira/browse/HDFS-10378 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.8.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Attachments: HDFS-10378-unit.patch, HDFS-10378.001.patch > > > Calling {{setOwner}} as a non-super user does trigger > {{AccessControlException}}, however, the message "Permission denied. > user=user1967821757 is not the owner of inode=child" is wrong. Expect this > message: "Non-super user cannot change owner". > Output of patched unit test {{TestPermission.testFilePermission}}: > {noformat} > 2016-05-06 16:45:44,915 [main] INFO security.TestPermission > (TestPermission.java:testFilePermission(280)) - GOOD: got > org.apache.hadoop.security.AccessControlException: Permission denied. 
> user=user1967821757 is not the owner of inode=child1 > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:273) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:250) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1642) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1626) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1595) > at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:88) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1717) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:835) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:481) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:665) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1755) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2417) > {noformat} > Will upload the unit test patch shortly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10382) In WebHDFS numeric usernames do not work with DataNode
ramtin created HDFS-10382: - Summary: In WebHDFS numeric usernames do not work with DataNode Key: HDFS-10382 URL: https://issues.apache.org/jira/browse/HDFS-10382 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: ramtin Assignee: ramtin Operations like {code:java}curl -i -L "http://<host>:<port>/webhdfs/v1/<path>?user.name=0123&op=OPEN"{code} that are directed to the DataNode fail because the DataNode does not read the suggested domain pattern from the configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing
[ https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-10360: --- Attachment: HDFS-10360.003.patch Forgot to rebase my patch... submitted a rebased patch. > DataNode may format directory and lose blocks if current/VERSION is missing > --- > > Key: HDFS-10360 > URL: https://issues.apache.org/jira/browse/HDFS-10360 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-10360.001.patch, HDFS-10360.002.patch, > HDFS-10360.003.patch > > > Under certain circumstances, if the current/VERSION of a storage directory is > missing, DataNode may format the storage directory even though _block files > are not missing_. > This is very easy to reproduce. Simply launch an HDFS cluster and create some > files. Delete current/VERSION, and restart the data node. > After the restart, the data node will format the directory and remove all > existing block files: > {noformat} > 2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: > Lock on /data/dfs/dn/in_use.lock acquired by nodename > 5...@weichiu-dn-2.vpc.cloudera.com > 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: > Storage directory /data/dfs/dn is not formatted for > BP-787466439-172.26.24.43-1462305406642 > 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: > Formatting ... > 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: > Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642 > 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: > Locking is disabled for > /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 > 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: > Block pool storage directory > /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted > for BP-787466439-172.26.24.43-1462305406642 > 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: > Formatting ... > 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: > Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory > /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current > {noformat} > The bug is: DataNode assumes that if none of {{current/VERSION}}, > {{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and > {{lastcheckpoint.tmp/}} exists, the storage directory contains nothing > important to HDFS and decides to format it. > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java#L526-L545 > However, block files may still exist, and in my opinion, we should do > everything possible to retain the block files. > I have two suggestions: > # Check whether the {{current/}} directory is empty. If not, throw an > InconsistentFSStateException in {{Storage#analyzeStorage}} instead of > assuming it is not formatted. Or, > # In {{Storage#clearDirectory}}, before it formats the storage directory, > rename or move the {{current/}} directory. Also, log whatever is being > renamed/moved. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
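A sketch of suggestion 1 (the exact integration point in {{Storage#analyzeStorage}} is illustrative; {{sd}} is the StorageDirectory being analyzed):
{code}
File currentDir = sd.getCurrentDir();
String[] contents = currentDir.list();
// current/VERSION is gone but block files may remain: refuse to treat the
// directory as unformatted rather than silently reformatting it.
if (contents != null && contents.length > 0) {
  throw new InconsistentFSStateException(sd.getRoot(),
      "VERSION file is missing but " + currentDir + " is not empty");
}
{code}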
[jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access
[ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277091#comment-15277091 ] Zhe Zhang commented on HDFS-9924: - Echoing Andrew that a design doc should be added. In particular, does the scope include read/write? Or only metadata operations, as shown in the current subtask list? Thanks. > [umbrella] Asynchronous HDFS Access > --- > > Key: HDFS-9924 > URL: https://issues.apache.org/jira/browse/HDFS-9924 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Reporter: Tsz Wo Nicholas Sze >Assignee: Xiaobing Zhou > > This is an umbrella JIRA for supporting Asynchronous HDFS Access. > Currently, all the API methods are blocking calls -- the caller is blocked > until the method returns. It is very slow if a client makes a large number > of independent calls in a single thread since each call has to wait until the > previous call is finished. It is inefficient if a client needs to create a > large number of threads to invoke the calls. > We propose adding a new API to support asynchronous calls, i.e. the caller is > not blocked. The methods in the new API immediately return a Java Future > object. The return value can be obtained by the usual Future.get() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10378) setOwner throws AccessControlException with wrong message
[ https://issues.apache.org/jira/browse/HDFS-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277080#comment-15277080 ] John Zhuge commented on HDFS-10378: --- Thanks [~yzhangal]. HDFS-10378.001.patch does include the unit test in HDFS-10378-unit.patch. I will look into No 2. > setOwner throws AccessControlException with wrong message > - > > Key: HDFS-10378 > URL: https://issues.apache.org/jira/browse/HDFS-10378 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.8.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Attachments: HDFS-10378-unit.patch, HDFS-10378.001.patch > > > Calling {{setOwner}} as a non-super user does trigger > {{AccessControlException}}, however, the message "Permission denied. > user=user1967821757 is not the owner of inode=child" is wrong. Expect this > message: "Non-super user cannot change owner". > Output of patched unit test {{TestPermission.testFilePermission}}: > {noformat} > 2016-05-06 16:45:44,915 [main] INFO security.TestPermission > (TestPermission.java:testFilePermission(280)) - GOOD: got > org.apache.hadoop.security.AccessControlException: Permission denied. > user=user1967821757 is not the owner of inode=child1 > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:273) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:250) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1642) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1626) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1595) > at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:88) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1717) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:835) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:481) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:665) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1755) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2417) > {noformat} > Will upload the unit test patch shortly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
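A simplified sketch of a check ordering that would produce the expected message; it ignores the group-change case that owners are allowed, so it illustrates the message fix, not the final patch:
{code}
// Fail with the intended message before the generic owner check can throw
// its misleading "is not the owner of inode" message.
if (username != null && !pc.isSuperUser()) {
  throw new AccessControlException("Non-super user cannot change owner");
}
fsd.checkOwner(pc, iip);
{code}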
[jira] [Updated] (HDFS-9902) Support different values of dfs.datanode.du.reserved per storage type
[ https://issues.apache.org/jira/browse/HDFS-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-9902: Release Note: Reserved space can be configured independently for different storage types for clusters with heterogeneous storage. The 'dfs.datanode.du.reserved' property name can be suffixed with a storage type (i.e. one of ssd, disk, archival or ram_disk). e.g. reserved space for RAM_DISK storage can be configured using the property 'dfs.datanode.du.reserved.ram_disk'. If a specific storage type reservation is not configured, the value specified by 'dfs.datanode.du.reserved' will be used for all volumes. > Support different values of dfs.datanode.du.reserved per storage type > - > > Key: HDFS-9902 > URL: https://issues.apache.org/jira/browse/HDFS-9902 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.2 >Reporter: Pan Yuxuan >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: HDFS-9902-02.patch, HDFS-9902-03.patch, > HDFS-9902-04.patch, HDFS-9902-05.patch, HDFS-9902.patch > > > Now Hadoop supports different storage types (DISK, SSD, ARCHIVE and > RAM_DISK), but they share one configuration, dfs.datanode.du.reserved. > The DISK size may be several TB and the RAM_DISK size may be only several > tens of GB. > The problem is that when I configure DISK and RAM_DISK (tmpfs) in the same > DN and set dfs.datanode.du.reserved to 10GB, this wastes a lot of > RAM_DISK space. > Since the usage of RAM_DISK can be 100%, I don't want the > dfs.datanode.du.reserved configured for DISK to impact the usage of tmpfs. > So can we make a new configuration for RAM_DISK or just skip this > configuration for RAM_DISK? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
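As a sketch, with illustrative values (the property names follow the release note above):
{code}
Configuration conf = new HdfsConfiguration();
// Generic reservation, used for any storage type without its own setting.
conf.setLong("dfs.datanode.du.reserved", 10L * 1024 * 1024 * 1024);          // 10 GB
// Storage-type-specific override, e.g. a much smaller reservation for tmpfs.
conf.setLong("dfs.datanode.du.reserved.ram_disk", 1L * 1024 * 1024 * 1024);  // 1 GB
{code}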
[jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access
[ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277047#comment-15277047 ] Tsz Wo Nicholas Sze commented on HDFS-9924: --- > Can we do this work on a branch? ... I think we don't need a branch for the moment since this feature does not affect the other components. This feature mainly adds new code and does not change the existing code much. > Could someone post a design doc with the motivations, proposed API, and > discussion? ... Motivations can be found in the description in this JIRA. For the proposed API, it makes more sense to discuss it in HADOOP-12910. We could post a design doc if it helps the discussion. Thanks! > [umbrella] Asynchronous HDFS Access > --- > > Key: HDFS-9924 > URL: https://issues.apache.org/jira/browse/HDFS-9924 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Reporter: Tsz Wo Nicholas Sze >Assignee: Xiaobing Zhou > > This is an umbrella JIRA for supporting Asynchronous HDFS Access. > Currently, all the API methods are blocking calls -- the caller is blocked > until the method returns. It is very slow if a client makes a large number > of independent calls in a single thread since each call has to wait until the > previous call is finished. It is inefficient if a client needs to create a > large number of threads to invoke the calls. > We propose adding a new API to support asynchronous calls, i.e. the caller is > not blocked. The methods in the new API immediately return a Java Future > object. The return value can be obtained by the usual Future.get() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10346) Implement asynchronous setPermission/setOwner for DistributedFileSystem
[ https://issues.apache.org/jira/browse/HDFS-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277031#comment-15277031 ] Tsz Wo Nicholas Sze commented on HDFS-10346: Thanks Xiaobing! Some comments on the patch. - In AsyncDistributedFileSystem, -* Need to call fixRelativePart. -* javadoc should describe the return value. - testPermissionChecking should first create all files/directories synchronously, and then call setPermission/setOwner asynchronously in a separate loop. - Indeed, testAggressiveConcurrentAsyncAPI is a better test. Why don't we just keep it and remove testPermissionChecking? - I suggest adding one more test to mix up rename/setPermission/setOwner together. > Implement asynchronous setPermission/setOwner for DistributedFileSystem > --- > > Key: HDFS-10346 > URL: https://issues.apache.org/jira/browse/HDFS-10346 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, hdfs-client >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10346-HDFS-9924.000.patch, > HDFS-10346-HDFS-9924.001.patch, HDFS-10346-HDFS-9924.003.patch > > > This is proposed to implement an asynchronous setPermission and setOwner. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
[ https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277030#comment-15277030 ] Rushabh S Shah commented on HDFS-10372: --- Thanks [~kihwal] for reviews and committing. Thanks [~xiaochen] and [~iwasakims] for reviews. > Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume > --- > > Key: HDFS-10372 > URL: https://issues.apache.org/jira/browse/HDFS-10372 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.7.3 > > Attachments: HDFS-10372.patch > > > TestFsDatasetImpl#testCleanShutdownOfVolume fails very often. > We added more debug information in HDFS-10260 to find out why this test is > failing. > Now I think I know the root cause of failure. > I thought that {{LocatedBlock#getLocations()}} returns an array of > DatanodeInfo but now I realized that it returns an array of > DatandeStorageInfo (which is subclass of DatanodeInfo). > In the test I intended to check whether the exception contains the xfer > address of the DatanodeInfo. Since {{DatanodeInfo#toString()}} method returns > the xfer address, I checked whether exception contains > {{DatanodeInfo#toString}} or not. > But since {{LocatedBlock#getLocations()}} returned an array of > DatanodeStorageInfo, it has storage info in the toString() implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
[ https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-10372: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.3 Status: Resolved (was: Patch Available) I've committed this to trunk through branch-2.7. Thanks for fixing this, [~shahrs87]. Thanks for valuable reviews, [~xiaochen] and [~iwasakims]. > Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume > --- > > Key: HDFS-10372 > URL: https://issues.apache.org/jira/browse/HDFS-10372 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 2.7.3 > > Attachments: HDFS-10372.patch > > > TestFsDatasetImpl#testCleanShutdownOfVolume fails very often. > We added more debug information in HDFS-10260 to find out why this test is > failing. > Now I think I know the root cause of failure. > I thought that {{LocatedBlock#getLocations()}} returns an array of > DatanodeInfo but now I realized that it returns an array of > DatandeStorageInfo (which is subclass of DatanodeInfo). > In the test I intended to check whether the exception contains the xfer > address of the DatanodeInfo. Since {{DatanodeInfo#toString()}} method returns > the xfer address, I checked whether exception contains > {{DatanodeInfo#toString}} or not. > But since {{LocatedBlock#getLocations()}} returned an array of > DatanodeStorageInfo, it has storage info in the toString() implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
[ https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277007#comment-15277007 ] Kihwal Lee edited comment on HDFS-10372 at 5/9/16 8:58 PM: --- [~iwasakims] has already +1'ed it, meaning the suggested change is not strictly necessary. I am committing this as is. was (Author: kihwal): [~iwasakims] has already +1'ed it, meaning the suggested change is strictly necessary. I am committing this as is. > Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume > --- > > Key: HDFS-10372 > URL: https://issues.apache.org/jira/browse/HDFS-10372 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Attachments: HDFS-10372.patch > > > TestFsDatasetImpl#testCleanShutdownOfVolume fails very often. > We added more debug information in HDFS-10260 to find out why this test is > failing. > Now I think I know the root cause of failure. > I thought that {{LocatedBlock#getLocations()}} returns an array of > DatanodeInfo but now I realized that it returns an array of > DatandeStorageInfo (which is subclass of DatanodeInfo). > In the test I intended to check whether the exception contains the xfer > address of the DatanodeInfo. Since {{DatanodeInfo#toString()}} method returns > the xfer address, I checked whether exception contains > {{DatanodeInfo#toString}} or not. > But since {{LocatedBlock#getLocations()}} returned an array of > DatanodeStorageInfo, it has storage info in the toString() implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9902) Support different values of dfs.datanode.du.reserved per storage type
[ https://issues.apache.org/jira/browse/HDFS-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-9902: Fix Version/s: 2.8.0 > Support different values of dfs.datanode.du.reserved per storage type > - > > Key: HDFS-9902 > URL: https://issues.apache.org/jira/browse/HDFS-9902 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.2 >Reporter: Pan Yuxuan >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: HDFS-9902-02.patch, HDFS-9902-03.patch, > HDFS-9902-04.patch, HDFS-9902-05.patch, HDFS-9902.patch > > > Now Hadoop supports different storage types (DISK, SSD, ARCHIVE and > RAM_DISK), but they share one configuration, dfs.datanode.du.reserved. > The DISK size may be several TB and the RAM_DISK size may be only several > tens of GB. > The problem is that when I configure DISK and RAM_DISK (tmpfs) in the same > DN and set dfs.datanode.du.reserved to 10GB, this wastes a lot of > RAM_DISK space. > Since the usage of RAM_DISK can be 100%, I don't want the > dfs.datanode.du.reserved configured for DISK to impact the usage of tmpfs. > So can we make a new configuration for RAM_DISK or just skip this > configuration for RAM_DISK? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
[ https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277007#comment-15277007 ] Kihwal Lee commented on HDFS-10372: --- [~iwasakims] has already +1'ed it, meaning the suggested change is strictly necessary. I am committing this as is. > Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume > --- > > Key: HDFS-10372 > URL: https://issues.apache.org/jira/browse/HDFS-10372 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Attachments: HDFS-10372.patch > > > TestFsDatasetImpl#testCleanShutdownOfVolume fails very often. > We added more debug information in HDFS-10260 to find out why this test is > failing. > Now I think I know the root cause of the failure. > I thought that {{LocatedBlock#getLocations()}} returns an array of > DatanodeInfo but now I realized that it returns an array of > DatanodeStorageInfo (which is a subclass of DatanodeInfo). > In the test I intended to check whether the exception contains the xfer > address of the DatanodeInfo. Since the {{DatanodeInfo#toString()}} method returns > the xfer address, I checked whether the exception contains > {{DatanodeInfo#toString}} or not. > But since {{LocatedBlock#getLocations()}} returned an array of > DatanodeStorageInfo, it has storage info in its toString() implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10378) setOwner throws AccessControlException with wrong message
[ https://issues.apache.org/jira/browse/HDFS-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276994#comment-15276994 ] Yongjun Zhang commented on HDFS-10378: -- Hi [~jzhuge], Thanks for reporting the issue and the patch. A couple of comments: 1. Would you please provide a single patch file that contains both the fix and the unit test, instead of multiple files? 2. {{setOwner}} expects the user to be a superuser or belong to the superuser group. The call to {{fsd.checkOwner(pc, iip);}} appears to be a no-op for this kind of user. Would you please look into whether we really need the call {{fsd.checkOwner(pc, iip);}} at all in the {{setOwner}} method? Is being a superuser both a necessary and sufficient condition for this method? Thanks. > setOwner throws AccessControlException with wrong message > - > > Key: HDFS-10378 > URL: https://issues.apache.org/jira/browse/HDFS-10378 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.8.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Labels: supportability > Attachments: HDFS-10378-unit.patch, HDFS-10378.001.patch > > > Calling {{setOwner}} as a non-super user does trigger > {{AccessControlException}}; however, the message "Permission denied. > user=user1967821757 is not the owner of inode=child" is wrong. Expect this > message: "Non-super user cannot change owner". > Output of patched unit test {{TestPermission.testFilePermission}}: > {noformat} > 2016-05-06 16:45:44,915 [main] INFO security.TestPermission > (TestPermission.java:testFilePermission(280)) - GOOD: got > org.apache.hadoop.security.AccessControlException: Permission denied. > user=user1967821757 is not the owner of inode=child1 > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:273) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:250) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1642) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1626) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1595) > at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:88) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1717) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:835) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:481) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:665) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1755) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2417) > {noformat} > Will upload the unit test patch 
shortly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
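A hedged sketch of the ordering the second review comment hints at; the method shape is hypothetical (the real logic lives in {{FSDirAttrOp.setOwner}}), and it only illustrates the idea that if being a superuser is the necessary and sufficient condition, checking it first also yields the accurate error message:
{code}
// Illustrative only: check superuser status up front so a non-super
// caller sees "Non-super user cannot change owner" instead of the
// misleading "is not the owner of inode" message.
static void setOwner(FSPermissionChecker pc, String username, String group)
    throws AccessControlException {
  if (!pc.isSuperUser()) {
    throw new AccessControlException("Non-super user cannot change owner");
  }
  // superuser confirmed; proceed with the actual ownership change
}
{code}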
[jira] [Commented] (HDFS-10260) TestFsDatasetImpl#testCleanShutdownOfVolume often fails
[ https://issues.apache.org/jira/browse/HDFS-10260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276911#comment-15276911 ] Rushabh S Shah commented on HDFS-10260: --- bq. FYI: It came up on my build By "my machine", I meant my Mac. I found the root cause of the failure. Please see HDFS-10372 for more details. > TestFsDatasetImpl#testCleanShutdownOfVolume often fails > --- > > Key: HDFS-10260 > URL: https://issues.apache.org/jira/browse/HDFS-10260 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, test >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-10260-v1.patch, HDFS-10260.patch > > > This test failure occurs in upstream Jenkins. Looking at the test code, I > think it should be improved to capture the root cause of the failure: > E.g. change {{Thread.sleep(1000)}} to {{GenericTestUtils.waitFor}} and use > {{GenericTestUtils.assertExceptionContains}} to replace > {code} > Assert.assertTrue(ioe.getMessage().contains(info.toString())); > {code} > https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/1062/testReport/junit/org.apache.hadoop.hdfs.server.datanode.fsdataset.impl/TestFsDatasetImpl/testCleanShutdownOfVolume/ > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testCleanShutdownOfVolume(TestFsDatasetImpl.java:683) > Standard Error > Exception in thread "DataNode: > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/, > > [DISK]file:/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2/]] > heartbeating to localhost/127.0.0.1:35113" java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockReports(FsDatasetImpl.java:1714) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdownBlockPool(FsDatasetImpl.java:2591) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:1479) > at > org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:411) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:494) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:749) > at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
[ https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276902#comment-15276902 ] Rushabh S Shah commented on HDFS-10372: --- bq. it's up to you to decide whether to improve it as Masatake Iwasaki suggested. I don't think this is required. I edited the test on my machine to create one file after one of the volumes went bad. It is able to create the file and I can see the block on the datanode's good volume. But I don't see any value in adding it to the patch. [~iwasakims]: Let me know if you still want me to create a new file. I will edit my patch. > Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume > --- > > Key: HDFS-10372 > URL: https://issues.apache.org/jira/browse/HDFS-10372 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Attachments: HDFS-10372.patch > > > TestFsDatasetImpl#testCleanShutdownOfVolume fails very often. > We added more debug information in HDFS-10260 to find out why this test is > failing. > Now I think I know the root cause of the failure. > I thought that {{LocatedBlock#getLocations()}} returns an array of > DatanodeInfo but now I realized that it returns an array of > DatanodeStorageInfo (which is a subclass of DatanodeInfo). > In the test I intended to check whether the exception contains the xfer > address of the DatanodeInfo. Since the {{DatanodeInfo#toString()}} method returns > the xfer address, I checked whether the exception contains > {{DatanodeInfo#toString}} or not. > But since {{LocatedBlock#getLocations()}} returned an array of > DatanodeStorageInfo, it has storage info in its toString() implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-6489) DFS Used space is not correct computed on frequent append operations
[ https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276872#comment-15276872 ] Andrew Wang commented on HDFS-6489: --- I guess this means that the available space block placement policy is also broken then :( Agree that we need a methodical rework. I'm a +0.5 on this patch, it looks good to me, but would need to spend more time understanding the existing accounting before I could +1. If another reviewer more familiar with this area can also review, that'd be appreciated. > DFS Used space is not correct computed on frequent append operations > > > Key: HDFS-6489 > URL: https://issues.apache.org/jira/browse/HDFS-6489 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0, 2.7.1, 2.7.2 >Reporter: stanley shi >Assignee: Weiwei Yang > Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, > HDFS-6489.003.patch, HDFS-6489.004.patch, HDFS-6489.005.patch, > HDFS-6489.006.patch, HDFS-6489.007.patch, HDFS6489.java > > > The current implementation of the Datanode will increase the DFS used space > on each block write operation. This is correct in most scenarios (creating a new > file), but sometimes it behaves incorrectly (appending small data to a large > block). > For example, I have a file with only one block (say, 60M). Then I try to > append to it very frequently, but each time I append only 10 bytes; > then on each append, dfs used will be increased by the length of the > block (60M), not the actual data length (10 bytes). > Consider a scenario where I use many clients to append concurrently to a large > number of files (1000+); assume the block size is 32M (half of the default > value), then the dfs used will be increased by 1000*32M = 32G on each append to > the files, but actually I only write 10K bytes; this will cause the datanode > to report insufficient disk space on data writes. > {quote}2014-06-04 15:27:34,719 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock > BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received > exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: > Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, > FINALIZED{quote} > But the actual disk usage: > {quote} > [root@hdsh143 ~]# df -h > Filesystem Size Used Avail Use% Mounted on > /dev/sda3 16G 2.9G 13G 20% / > tmpfs 1.9G 72K 1.9G 1% /dev/shm > /dev/sda1 97M 32M 61M 35% /boot > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
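Since the arithmetic in the description is the crux (each 10-byte append is charged the full 60M block length), here is a self-contained sketch of the two accounting behaviors; the class and method names are hypothetical, not the actual FsVolumeImpl API:
{code}
// Illustrates the reported bug versus delta-based accounting.
class DfsUsedSketch {
  private long dfsUsed;

  // Reported behavior: every append charges the whole block length (60M).
  void onAppendBuggy(long blockLength) {
    dfsUsed += blockLength;
  }

  // Intended behavior: charge only the bytes actually appended (10 bytes).
  void onAppendFixed(long lengthBefore, long lengthAfter) {
    dfsUsed += lengthAfter - lengthBefore;
  }

  long getDfsUsed() {
    return dfsUsed;
  }
}
{code}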
[jira] [Commented] (HDFS-10380) libhdfs++: Get rid of socket template parameter in RpcConnection
[ https://issues.apache.org/jira/browse/HDFS-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276857#comment-15276857 ] Hadoop QA commented on HDFS-10380: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 25s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 21s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 46s {color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 50s {color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 27s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 25s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 9s {color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 11s {color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 53m 45s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12803024/HDFS-10380.HDFS-8707.000.patch | | JIRA Issue | HDFS-10380 | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 4d6709fd8099 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / d187112 | | Default Java | 1.7.0_101 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 | | JDK v1.7.0_101 Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15398/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15398/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > libhdfs++: Get rid of socket template parameter in RpcConnection > - > > Key: HDFS-10380 > URL: https://issues.apache.org/jira/browse/HDFS-10380 >
[jira] [Updated] (HDFS-10287) MiniDFSCluster should implement AutoCloseable
[ https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Bokor updated HDFS-10287: Attachment: HDFS-10287.02.patch The test failure seems unrelated. Another JIRA was reported regarding this failure: HDFS-10260. Just as a double check, triggering the build again with [^HDFS-10287.02.patch] > MiniDFSCluster should implement AutoCloseable > - > > Key: HDFS-10287 > URL: https://issues.apache.org/jira/browse/HDFS-10287 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.7.0 >Reporter: John Zhuge >Assignee: Andras Bokor >Priority: Trivial > Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch > > > {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support > [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html]. > It will make test code a little cleaner and more reliable. > Since {{AutoCloseable}} is only in Java 1.7 or later, this cannot be > backported to Hadoop versions prior to 2.7. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
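For reference, a usage sketch of what the change enables once {{MiniDFSCluster}} implements {{AutoCloseable}} as proposed; it is illustrative and assumes the standard builder API:
{code}
// try-with-resources shuts the cluster down even if the test body throws.
Configuration conf = new HdfsConfiguration();
try (MiniDFSCluster cluster =
         new MiniDFSCluster.Builder(conf).numDataNodes(1).build()) {
  cluster.waitActive();
  // ... test body ...
} // cluster.close() runs automatically here
{code}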
[jira] [Commented] (HDFS-10260) TestFsDatasetImpl#testCleanShutdownOfVolume often fails
[ https://issues.apache.org/jira/browse/HDFS-10260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276863#comment-15276863 ] Andras Bokor commented on HDFS-10260: - bq. I am not able to make it fail on my machine. It never failed on our internal Jenkins server also. FYI: It came up on my [build|https://builds.apache.org/job/PreCommit-HDFS-Build/15397/testReport/] > TestFsDatasetImpl#testCleanShutdownOfVolume often fails > --- > > Key: HDFS-10260 > URL: https://issues.apache.org/jira/browse/HDFS-10260 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, test >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah > Fix For: 2.8.0 > > Attachments: HDFS-10260-v1.patch, HDFS-10260.patch > > > This test failure occurs in upstream Jenkins. Looking at the test code, I > think it should be improved to capture the root cause of the failure: > E.g. change {{Thread.sleep(1000)}} to {{GenericTestUtils.waitFor}} and use > {{GenericTestUtils.assertExceptionContains}} to replace > {code} > Assert.assertTrue(ioe.getMessage().contains(info.toString())); > {code} > https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/1062/testReport/junit/org.apache.hadoop.hdfs.server.datanode.fsdataset.impl/TestFsDatasetImpl/testCleanShutdownOfVolume/ > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testCleanShutdownOfVolume(TestFsDatasetImpl.java:683) > Standard Error > Exception in thread "DataNode: > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/, > > [DISK]file:/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2/]] > heartbeating to localhost/127.0.0.1:35113" java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockReports(FsDatasetImpl.java:1714) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdownBlockPool(FsDatasetImpl.java:2591) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:1479) > at > org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:411) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:494) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:749) > at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9545) DiskBalancer : Add Plan Command
[ https://issues.apache.org/jira/browse/HDFS-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-9545: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > DiskBalancer : Add Plan Command > --- > > Key: HDFS-9545 > URL: https://issues.apache.org/jira/browse/HDFS-9545 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-9545-HDFS-1312.001.patch, > HDFS-9545-HDFS-1312.002.patch > > > Allows a user to create a Plan and persist it. This is useful if users want > to evaluate the actions of the disk balancer before running the balancing job -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9545) DiskBalancer : Add Plan Command
[ https://issues.apache.org/jira/browse/HDFS-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276834#comment-15276834 ] Anu Engineer commented on HDFS-9545: [~eddyxu] Thanks for the code review. I have committed this to the feature branch. > DiskBalancer : Add Plan Command > --- > > Key: HDFS-9545 > URL: https://issues.apache.org/jira/browse/HDFS-9545 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-9545-HDFS-1312.001.patch, > HDFS-9545-HDFS-1312.002.patch > > > Allows a user to create a Plan and persist it. This is useful if users want > to evaluate the actions of the disk balancer before running the balancing job -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10381) DataNode exclusion log message should be a warning
[ https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276825#comment-15276825 ] Mingliang Liu commented on HDFS-10381: -- Is this for {{DataStreamer}}? Perhaps we can set the component field to {{hdfs-client}}. > DataNode exclusion log message should be a warning > -- > > Key: HDFS-10381 > URL: https://issues.apache.org/jira/browse/HDFS-10381 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > > When adding a DN to {{excludedNodes}}, it should log a warning message > instead of info. > {code} > success = createBlockOutputStream(nodes, storageTypes, 0L, false); > if (!success) { > LOG.info("Abandoning " + block); > dfsClient.namenode.abandonBlock(block, stat.getFileId(), src, > dfsClient.clientName); > block = null; > final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()]; > LOG.info("Excluding datanode " + badNode); > excludedNodes.put(badNode, badNode); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing
[ https://issues.apache.org/jira/browse/HDFS-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-10360: --- Attachment: HDFS-10360.002.patch Rev02: add a log to record which files are being deleted during a DataNode reformat operation. There's no need to add extra JMX reporting code; as long as an IOException is thrown while adding the volume, it is added into JMX by the existing code. The test failures seem unrelated. I don't have these test failures in my tree. > DataNode may format directory and lose blocks if current/VERSION is missing > --- > > Key: HDFS-10360 > URL: https://issues.apache.org/jira/browse/HDFS-10360 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-10360.001.patch, HDFS-10360.002.patch > > > Under certain circumstances, if the current/VERSION of a storage directory is > missing, DataNode may format the storage directory even though _block files > are not missing_. > This is very easy to reproduce. Simply launch an HDFS cluster and create some > files. Delete current/VERSION, and restart the data node. > After the restart, the data node will format the directory and remove all > existing block files: > {noformat} > 2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: > Lock on /data/dfs/dn/in_use.lock acquired by nodename > 5...@weichiu-dn-2.vpc.cloudera.com > 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: > Storage directory /data/dfs/dn is not formatted for > BP-787466439-172.26.24.43-1462305406642 > 2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: > Formatting ... > 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: > Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642 > 2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: > Locking is disabled for > /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 > 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: > Block pool storage directory > /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted > for BP-787466439-172 > .26.24.43-1462305406642 > 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: > Formatting ... > 2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: > Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory > /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current > {noformat} > The bug is: DataNode assumes that if none of {{current/VERSION}}, > {{previous/}}, {{previous.tmp/}}, {{removed.tmp/}}, {{finalized.tmp/}} and > {{lastcheckpoint.tmp/}} exists, the storage directory contains nothing > important to HDFS and decides to format it. > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java#L526-L545 > However, block files may still exist, and in my opinion, we should do > everything possible to retain the block files. > I have two suggestions: > # check if the {{current/}} directory is empty. If not, throw an > InconsistentFSStateException in {{Storage#analyzeStorage}} instead of > assuming it's not formatted. Or, > # In {{Storage#clearDirectory}}, before it formats the storage directory, > rename or move the {{current/}} directory. Also, log whatever is being > renamed/moved. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
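A hedged sketch of suggestion 1 from the description above; it is illustrative, not the attached patch, and {{storageDir}} stands in for the storage root {{File}}:
{code}
// Refuse to treat the directory as unformatted if current/ still has
// content; reformatting here would silently delete the block files.
File currentDir = new File(storageDir, "current");
String[] contents = currentDir.list();
if (contents != null && contents.length > 0) {
  throw new InconsistentFSStateException(storageDir,
      "current/VERSION is missing but current/ is not empty");
}
{code}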
[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
[ https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276794#comment-15276794 ] Rushabh S Shah commented on HDFS-10372: --- bq. The test expected that the message in exception on out.close() contains the name of failed volume (to which the replica was written) but it contained only info about live volume (data2). When the client asked for locations for the first block, the namenode selected a datanode with any random storage info within that datanode. Refer to the {{DataStreamer.locateFollowingBlock(DatanodeInfo[] excludedNodes)}} method for more details. When the client started writing to the datanode, the datanode selected a volume according to the RoundRobinVolumeChoosingPolicy, and it can select a storage which is different from what the namenode has stored in its triplets. When the datanode sends an IBR (with RECEIVING_BLOCK), the namenode will replace the storage info in its triplets with the storage info the datanode reported. But the change in storage info is not propagated back to the client, so the client still has stale storage info. When the client tried to close the file, the datanode threw an exception (since the volume had gone bad), but since the client had stale storage info, it saved the exception with the old storage info. This is the reason why the test was flaky in the first place. On my machine, the test finishes within 2 seconds, so the datanode didn't send any IBR and the storage info was not changed in the namenode. But on the Jenkins build machines, the test ran for more than 8 seconds, which gave the datanode ample time to send an IBR. [~iwasakims]: I hope this answers your question. > Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume > --- > > Key: HDFS-10372 > URL: https://issues.apache.org/jira/browse/HDFS-10372 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Attachments: HDFS-10372.patch > > > TestFsDatasetImpl#testCleanShutdownOfVolume fails very often. > We added more debug information in HDFS-10260 to find out why this test is > failing. > Now I think I know the root cause of the failure. > I thought that {{LocatedBlock#getLocations()}} returns an array of > DatanodeInfo but now I realized that it returns an array of > DatanodeStorageInfo (which is a subclass of DatanodeInfo). > In the test I intended to check whether the exception contains the xfer > address of the DatanodeInfo. Since the {{DatanodeInfo#toString()}} method returns > the xfer address, I checked whether the exception contains > {{DatanodeInfo#toString}} or not. > But since {{LocatedBlock#getLocations()}} returned an array of > DatanodeStorageInfo, it has storage info in its toString() implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work started] (HDFS-10381) DataNode exclusion log message should be a warning
[ https://issues.apache.org/jira/browse/HDFS-10381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-10381 started by John Zhuge. - > DataNode exclusion log message should be a warning > -- > > Key: HDFS-10381 > URL: https://issues.apache.org/jira/browse/HDFS-10381 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > > When adding a DN to {{excludedNodes}}, it should log a warning message > instead of info. > {code} > success = createBlockOutputStream(nodes, storageTypes, 0L, false); > if (!success) { > LOG.info("Abandoning " + block); > dfsClient.namenode.abandonBlock(block, stat.getFileId(), src, > dfsClient.clientName); > block = null; > final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()]; > LOG.info("Excluding datanode " + badNode); > excludedNodes.put(badNode, badNode); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10381) DataNode exclusion log message should be a warning
John Zhuge created HDFS-10381: - Summary: DataNode exclusion log message should be a warning Key: HDFS-10381 URL: https://issues.apache.org/jira/browse/HDFS-10381 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: John Zhuge Assignee: John Zhuge Priority: Minor When adding a DN to {{excludedNodes}}, it should log a warning message instead of info. {code} success = createBlockOutputStream(nodes, storageTypes, 0L, false); if (!success) { LOG.info("Abandoning " + block); dfsClient.namenode.abandonBlock(block, stat.getFileId(), src, dfsClient.clientName); block = null; final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()]; LOG.info("Excluding datanode " + badNode); excludedNodes.put(badNode, badNode); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
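Sketched against the snippet in the description, the requested change is just the log level on the two exclusion-path messages; this is a sketch of the intent, not the attached patch:
{code}
success = createBlockOutputStream(nodes, storageTypes, 0L, false);
if (!success) {
  // WARN instead of INFO: abandoning a block is an abnormal event.
  LOG.warn("Abandoning " + block);
  dfsClient.namenode.abandonBlock(block, stat.getFileId(), src,
      dfsClient.clientName);
  block = null;
  final DatanodeInfo badNode = nodes[errorState.getBadNodeIndex()];
  // WARN instead of INFO: excluding a datanode deserves attention.
  LOG.warn("Excluding datanode " + badNode);
  excludedNodes.put(badNode, badNode);
}
{code}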
[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable
[ https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276767#comment-15276767 ] Hadoop QA commented on HDFS-10287: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: patch generated 0 new + 208 unchanged - 3 fixed = 208 total (was 211) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 37s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 58s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 136m 57s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:cf2ee45 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12801035/HDFS-10287.01.patch | | JIRA Issue | HDFS-10287 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 5cdb611217f9 3.13.0-36-lowlatency #63-Ubu
[jira] [Updated] (HDFS-10380) libhdfs++: Get rid of socket template parameter in RpcConnection
[ https://issues.apache.org/jira/browse/HDFS-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-10380: --- Status: Patch Available (was: Open) > libhdfs++: Get rid of socket template parameter in RpcConnection > - > > Key: HDFS-10380 > URL: https://issues.apache.org/jira/browse/HDFS-10380 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-10380.HDFS-8707.000.patch > > > RpcConnection is always templated on asio::ip::tcp::socket except in > rpc_engine_test.cc. My understanding is the original reason for using this > as a template parameter was to be able to use trait injection in gmock tests. > This is useful for testing but makes debugging a lot more tricky due to the > extra stuff that shows up on the stack. Heavily templated code also tends to > produce very unhelpful compile errors, e.g. when a missing semicolon in a > templated class turns into pages of errors coming out of stuff that depended > on the instantiation. > This sort of work was already accomplished elsewhere by HDFS-9144; it looks > like RpcConnection was one of the few areas where they were left in place. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10380) libhdfs++: Get rid of socket template parameter in RpcConnection
[ https://issues.apache.org/jira/browse/HDFS-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-10380: --- Attachment: HDFS-10380.HDFS-8707.000.patch First quick pass: -got rid of the RpcConnection template parameter -typedefed asio::ip::tcp::socket as rpc_socket_t, could just be socket_t. -renamed RpcConnectionImpl::next_layer_ to RpcConnectionImpl::socket_ to make the code a bit more self-explanatory. todo: -Was in a hurry to get this building so just declared all of the methods in rpc_connection.h as inline. Will move to rpc_connection.cc. -rpc_engine_test (the test the template was there for) needs a lot of work to be able to run with this change. It's fairly dense and has nearly no comments, so it might be easier to just rewrite it. > libhdfs++: Get rid of socket template parameter in RpcConnection > - > > Key: HDFS-10380 > URL: https://issues.apache.org/jira/browse/HDFS-10380 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-10380.HDFS-8707.000.patch > > > RpcConnection is always templated on asio::ip::tcp::socket except in > rpc_engine_test.cc. My understanding is the original reason for using this > as a template parameter was to be able to use trait injection in gmock tests. > This is useful for testing but makes debugging a lot more tricky due to the > extra stuff that shows up on the stack. Heavily templated code also tends to > produce very unhelpful compile errors, e.g. when a missing semicolon in a > templated class turns into pages of errors coming out of stuff that depended > on the instantiation. > This sort of work was already accomplished elsewhere by HDFS-9144; it looks > like RpcConnection was one of the few areas where they were left in place. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10380) libhdfs++: Get rid of socket template parameter in RpcConnection
James Clampffer created HDFS-10380: -- Summary: libhdfs++: Get rid of socket template parameter in RpcConnection Key: HDFS-10380 URL: https://issues.apache.org/jira/browse/HDFS-10380 Project: Hadoop HDFS Issue Type: Sub-task Reporter: James Clampffer Assignee: James Clampffer RpcConnection is always templated on asio::ip::tcp::socket except in rpc_engine_test.cc. My understanding is the original reason for using this as a template parameter was to be able to use trait injection in gmock tests. This is useful for testing but makes debugging a lot more tricky due to the extra stuff that shows up on the stack. Heavily templated code also tends to produce very unhelpful compile errors, e.g. when a missing semicolon in a templated class turns into pages of errors coming out of stuff that depended on the instantiation. This sort of work was already accomplished elsewhere by HDFS-9144; it looks like RpcConnection was one of the few areas where they were left in place. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7212) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time
[ https://issues.apache.org/jira/browse/HDFS-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276614#comment-15276614 ] Chris Nauroth commented on HDFS-7212: - Hello [~rajeshhadoop]. Based on these symptoms, you'll probably be interested in multiple bug fixes in short-circuit read that went in after Apache Hadoop 2.4.0: HADOOP-11333, HADOOP-11604, HADOOP-11648, HADOOP-11802 and HDFS-8429. > FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a > very long time > - > > Key: HDFS-7212 > URL: https://issues.apache.org/jira/browse/HDFS-7212 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.4.0 > Environment: PROD >Reporter: Istvan Szukacs > > There are 3000 - 8000 threads in each datanode JVM, blocking the entire VM > and rendering the service unusable, missing heartbeats and stopping data > access. The threads look like this: > {code} > 3415 (state = BLOCKED) > - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may > be imprecise) > - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, > line=186 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() > @bci=1, line=834 (Interpreted frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, > int) @bci=67, line=867 (Interpreted frame) > - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, > line=1197 (Interpreted frame) > - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, > line=214 (Compiled frame) > - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=290 (Compiled > frame) > - > org.apache.hadoop.net.unix.DomainSocketWatcher.add(org.apache.hadoop.net.unix.DomainSocket, > org.apache.hadoop.net.unix.DomainSocketWatcher$Handler) @bci=4, line=286 > (Interpreted frame) > - > org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(java.lang.String, > org.apache.hadoop.net.unix.DomainSocket) @bci=169, line=283 (Interpreted > frame) > - > org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(java.lang.String) > @bci=212, line=413 (Interpreted frame) > - > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(java.io.DataInputStream) > @bci=13, line=172 (Interpreted frame) > - > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(org.apache.hadoop.hdfs.protocol.datatransfer.Op) > @bci=149, line=92 (Compiled frame) > - org.apache.hadoop.hdfs.server.datanode.DataXceiver.run() @bci=510, line=232 > (Compiled frame) > - java.lang.Thread.run() @bci=11, line=744 (Interpreted frame) > {code} > Has anybody seen this before? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10322) DomainSocket error lead to more and more DataNode thread waiting
[ https://issues.apache.org/jira/browse/HDFS-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-10322: - Summary: DomainSocket error lead to more and more DataNode thread waiting (was: DomianSocket error lead to more and more DataNode thread waiting ) > DomainSocket error lead to more and more DataNode thread waiting > - > > Key: HDFS-10322 > URL: https://issues.apache.org/jira/browse/HDFS-10322 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.5.0 >Reporter: ChenFolin > Fix For: 2.6.4 > > > When short-circuit read is enabled and a DomainSocket broken pipe error happens, the > Datanode will produce more and more waiting threads. > It is similar to HADOOP-11802, but I do not think they are the same problem, > because the DomainSocket thread is in the RUNNABLE state. > stack log: > "DataXceiver for client unix:/var/run/hadoop-hdfs/dn.50010 [Waiting for > operation #1]" daemon prio=10 tid=0x0278e000 nid=0x2bc6 waiting on > condition [0x7f2d6e4a5000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00061c493500> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043) > at > org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:316) > at > org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:394) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:178) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:93) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) > at java.lang.Thread.run(Thread.java:745) > =DomainSocketWatcher > "Thread-759187" daemon prio=10 tid=0x0219c800 nid=0x8c56 runnable > [0x7f2dbe4cb000] >java.lang.Thread.State: RUNNABLE > at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native Method) > at > org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52) > at > org.apache.hadoop.net.unix.DomainSocketWatcher$1.run(DomainSocketWatcher.java:474) > at java.lang.Thread.run(Thread.java:745) > ===datanode error log > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: > datanode-:50010:DataXceiver error processing REQUEST_SHORT_CIRCUIT_SHM > operation src: unix:/var/run/hadoop-hdfs/dn.50010 dst: > java.net.SocketException: write(2) error: Broken pipe > at org.apache.hadoop.net.unix.DomainSocket.writeArray0(Native Method) > at org.apache.hadoop.net.unix.DomainSocket.access$300(DomainSocket.java:45) > at > org.apache.hadoop.net.unix.DomainSocket$DomainOutputStream.write(DomainSocket.java:601) > at > com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833) > at com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843) > at > com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:91) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.sendShmSuccessResponse(DataXceiver.java:371) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:409) > at >
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:178) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:93) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) > at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6489) DFS Used space is not correct computed on frequent append operations
[ https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-6489: --- Attachment: HDFS-6489.007.patch Andrew! Thanks for the review. Here's a patch with the changes you suggested. I am sticking with the conditional for RBW for now because 1. It's the common case, 2. RUR doesn't have {{originalBytesReserved}}. Even with this, {{dfsUsed}} and {{numblocks}} counting is all messed up. e.g. [FsDatasetImpl.removeOldBlock|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L2886] calls {{decDfsUsedAndNumBlocks}} twice (so even though {{dfsUsed}} is correctly decremented, {{numBlocks}} is not). [~brahmareddy], [~vinayrpet] am I reading this right? To really get this sorted out, we should probably have a unit test framework that verifies DN-side accounting is correct for several different operations (block creation, appends, block transfer (e.g. from 1 storage to another), etc.). Unfortunately, I don't think I'll have the cycles for that. Sorry :( Arpit has seemed amenable to [removing reservations|https://issues.apache.org/jira/browse/HDFS-9530?focusedCommentId=15248968&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15248968] if there is some other alternative. I think we should disable the rejection of writes by DNs based on reservations until we can be sure that our accounting is correct. Just my $0.02 > DFS Used space is not correct computed on frequent append operations > > > Key: HDFS-6489 > URL: https://issues.apache.org/jira/browse/HDFS-6489 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0, 2.7.1, 2.7.2 >Reporter: stanley shi >Assignee: Weiwei Yang > Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, > HDFS-6489.003.patch, HDFS-6489.004.patch, HDFS-6489.005.patch, > HDFS-6489.006.patch, HDFS-6489.007.patch, HDFS6489.java > > > The current implementation of the Datanode will increase the DFS used space > on each block write operation. This is correct in most scenarios (creating a new > file), but sometimes it behaves incorrectly (appending small data to a large > block). > For example, I have a file with only one block (say, 60M). Then I try to > append to it very frequently, but each time I append only 10 bytes; > then on each append, dfs used will be increased by the length of the > block (60M), not the actual data length (10 bytes). > Consider a scenario where I use many clients to append concurrently to a large > number of files (1000+); assume the block size is 32M (half of the default > value), then the dfs used will be increased by 1000*32M = 32G on each append to > the files, but actually I only write 10K bytes; this will cause the datanode > to report insufficient disk space on data writes. 
> {quote}2014-06-04 15:27:34,719 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock > BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received > exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: > Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, > FINALIZED{quote} > But the actual disk usage: > {quote} > [root@hdsh143 ~]# df -h > Filesystem Size Used Avail Use% Mounted on > /dev/sda3 16G 2.9G 13G 20% / > tmpfs 1.9G 72K 1.9G 1% /dev/shm > /dev/sda1 97M 32M 61M 35% /boot > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-10287) MiniDFSCluster should implement AutoCloseable
[ https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Bokor reassigned HDFS-10287: --- Assignee: Andras Bokor (was: John Zhuge) > MiniDFSCluster should implement AutoCloseable > - > > Key: HDFS-10287 > URL: https://issues.apache.org/jira/browse/HDFS-10287 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.7.0 >Reporter: John Zhuge >Assignee: Andras Bokor >Priority: Trivial > Attachments: HDFS-10287.01.patch > > > {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support > [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html]. > It will make test code a little cleaner and more reliable. > Since {{AutoCloseable}} is only in Java 1.7 or later, this cannot be > backported to Hadoop versions prior to 2.7. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10287) MiniDFSCluster should implement AutoCloseable
[ https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Bokor updated HDFS-10287: Status: Patch Available (was: Open) > MiniDFSCluster should implement AutoCloseable > - > > Key: HDFS-10287 > URL: https://issues.apache.org/jira/browse/HDFS-10287 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.7.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > Attachments: HDFS-10287.01.patch > > > {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support > [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html]. > It will make test code a little cleaner and more reliable. > Since {{AutoCloseable}} is only in Java 1.7 or later, this cannot be > backported to Hadoop versions prior to 2.7. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10287) MiniDFSCluster should implement AutoCloseable
[ https://issues.apache.org/jira/browse/HDFS-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276434#comment-15276434 ] Andras Bokor commented on HDFS-10287: - I see your point about the tests. These tests were added with HDFS-2209. Before that patch, {{MiniDFSCluster}} used only the {{test.build.data}} system property to determine the base directory. Now the base directory can be set through the config object. Based on the comments on HDFS-2209, creating two instances in the same JVM was not possible before the patch (though it is not 100% clear to me why; it seems to me that setting the system property between the two cluster initializations should work). It seems to me {{testDualCluster}} is a proof of concept that the new feature works. But indeed, the first test proves that {{MiniDFSCluster}} uses the {{hdfs.minidfs.basedir}} property correctly. What is your suggestion? Leave it as is, or modify/remove? > MiniDFSCluster should implement AutoCloseable > - > > Key: HDFS-10287 > URL: https://issues.apache.org/jira/browse/HDFS-10287 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 2.7.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > Attachments: HDFS-10287.01.patch > > > {{MiniDFSCluster}} should implement {{AutoCloseable}} in order to support > [try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html]. > It will make test code a little cleaner and more reliable. > Since {{AutoCloseable}} is only in Java 1.7 or later, this cannot be > backported to Hadoop versions prior to 2.7. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
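For readers following the dual-cluster discussion, the scenario boils down to something like the sketch below: two clusters in one JVM, isolated through {{hdfs.minidfs.basedir}} rather than the {{test.build.data}} system property (the base directories here are purely illustrative):

{code}
// Illustrative sketch of the HDFS-2209 feature under discussion: two
// MiniDFSClusters in one JVM, separated via hdfs.minidfs.basedir.
Configuration conf1 = new HdfsConfiguration();
conf1.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, "/tmp/minidfs-one");
Configuration conf2 = new HdfsConfiguration();
conf2.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, "/tmp/minidfs-two");
MiniDFSCluster cluster1 = new MiniDFSCluster.Builder(conf1).build();
MiniDFSCluster cluster2 = new MiniDFSCluster.Builder(conf2).build();
try {
  // ... both clusters run side by side ...
} finally {
  cluster2.shutdown();
  cluster1.shutdown();
}
{code}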
[jira] [Commented] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
[ https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276409#comment-15276409 ] Hadoop QA commented on HDFS-10333: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 49s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 3s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 54m 45s {color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 136m 57s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.shortcircuit.TestShortCircuitCache | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:cf2ee45 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12802974/HDFS-10333.001.patch | | JIRA Issue | HDFS-10333 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux beb0d668ac24 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git re
[jira] [Commented] (HDFS-10372) Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
[ https://issues.apache.org/jira/browse/HDFS-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276350#comment-15276350 ] Kihwal Lee commented on HDFS-10372: --- [~shahrs87], it's up to you to decide whether to improve it as [~iwasakims] suggested. > Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume > --- > > Key: HDFS-10372 > URL: https://issues.apache.org/jira/browse/HDFS-10372 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.3 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Attachments: HDFS-10372.patch > > > TestFsDatasetImpl#testCleanShutdownOfVolume fails very often. > We added more debug information in HDFS-10260 to find out why this test is > failing. > Now I think I know the root cause of the failure. > I thought that {{LocatedBlock#getLocations()}} returns an array of > DatanodeInfo, but now I realize that it returns an array of > DatanodeStorageInfo (which is a subclass of DatanodeInfo). > In the test I intended to check whether the exception contains the xfer > address of the DatanodeInfo. Since the {{DatanodeInfo#toString()}} method returns > the xfer address, I checked whether the exception contains > {{DatanodeInfo#toString}} or not. > But since {{LocatedBlock#getLocations()}} returns an array of > DatanodeStorageInfo, its toString() implementation includes the storage info as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
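In other words, the fix amounts to asserting on the xfer address itself rather than on the full {{toString()}}. A minimal sketch, with the {{lb}} and {{ioe}} variable names assumed rather than taken from the patch:

{code}
// Compare against the xfer address directly instead of the node's
// toString(), since getLocations() returns objects whose toString()
// appends storage info on top of the xfer address.
DatanodeInfo badNode = lb.getLocations()[0];  // lb: the LocatedBlock under test
assertTrue("Expected the bad node's xfer address in the exception",
    ioe.getMessage().contains(badNode.getXferAddr()));
{code}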
[jira] [Commented] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
[ https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276285#comment-15276285 ] Yiqun Lin commented on HDFS-10333: -- Sorry, let me correct my earlier statement: {quote}The test will fail once the socket I/O timeout happens{quote} It should instead read: the test has a higher chance of failing when socket I/O timeouts happen frequently over a period of time. > Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk > --- > > Key: HDFS-10333 > URL: https://issues.apache.org/jira/browse/HDFS-10333 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Yongjun Zhang >Assignee: Yiqun Lin > Attachments: HDFS-10333.001.patch > > > Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25): > {code} > -- > T E S T S > --- > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; > support was removed in 8.0 > Running org.apache.hadoop.hdfs.TestFileAppend > Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend > testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend) Time elapsed: > 3.674 sec <<< ERROR! > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. > at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} > However, when I run with Java1.7, the test is sometimes successful, and it > sometimes fails with > {code} > Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 41.32 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend > testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend) Time elapsed: > 9.099 sec <<< ERROR! > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. 
> at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} > The failure of this test is intermittent, but it fails pretty often. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
[ https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiqun Lin reassigned HDFS-10333: Assignee: Yiqun Lin > Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk > --- > > Key: HDFS-10333 > URL: https://issues.apache.org/jira/browse/HDFS-10333 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Yongjun Zhang >Assignee: Yiqun Lin > Attachments: HDFS-10333.001.patch > > > Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25): > {code} > -- > T E S T S > --- > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; > support was removed in 8.0 > Running org.apache.hadoop.hdfs.TestFileAppend > Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend > testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend) Time elapsed: > 3.674 sec <<< ERROR! > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. > at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} > However, when I run with Java1.7, the test is sometimes successful, and it > sometimes fails with > {code} > Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 41.32 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend > testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend) Time elapsed: > 9.099 sec <<< ERROR! > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. 
> at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} > The failure of this test is intermittent, but it fails pretty often. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
[ https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiqun Lin updated HDFS-10333: - Status: Patch Available (was: Open) > Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk > --- > > Key: HDFS-10333 > URL: https://issues.apache.org/jira/browse/HDFS-10333 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Yongjun Zhang >Assignee: Yiqun Lin > Attachments: HDFS-10333.001.patch > > > Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25): > {code} > -- > T E S T S > --- > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; > support was removed in 8.0 > Running org.apache.hadoop.hdfs.TestFileAppend > Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend > testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend) Time elapsed: > 3.674 sec <<< ERROR! > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. > at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} > However, when I run with Java1.7, the test is sometimes successful, and it > sometimes fails with > {code} > Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 41.32 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend > testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend) Time elapsed: > 9.099 sec <<< ERROR! > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. 
> at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} > The failure of this test is intermittent, but it fails pretty often. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
[ https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiqun Lin updated HDFS-10333: - Attachment: HDFS-10333.001.patch > Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk > --- > > Key: HDFS-10333 > URL: https://issues.apache.org/jira/browse/HDFS-10333 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Yongjun Zhang >Assignee: Yiqun Lin > Attachments: HDFS-10333.001.patch > > > Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25): > {code} > -- > T E S T S > --- > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; > support was removed in 8.0 > Running org.apache.hadoop.hdfs.TestFileAppend > Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend > testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend) Time elapsed: > 3.674 sec <<< ERROR! > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. > at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} > However, when I run with Java1.7, the test is sometimes successful, and it > sometimes fails with > {code} > Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 41.32 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend > testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend) Time elapsed: > 9.099 sec <<< ERROR! > java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:49006,DS-498240fa-d1c7-4ba1-b97e-a1761cbbefa5,DISK], > > DatanodeInfoWithStorage[127.0.0.1:43097,DS-b83b49ce-fc14-4b9e-a3fc-7df2cd9fc753,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. 
> at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1162) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1423) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1338) > at > org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1321) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:599) > {code} > The failure of this test is intermittent, but it fails pretty often. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10333) Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk
[ https://issues.apache.org/jira/browse/HDFS-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276253#comment-15276253 ] Yiqun Lin commented on HDFS-10333: -- I ran this test and other similar tests in {{TestFileAppend}} locally. I found that connection-refused errors sometimes happened due to socket I/O timeouts. Here are the logs: {code} 2016-05-09 17:33:49,828 [DataXceiver for client DFSClient_NONMAPREDUCE_140749666_1 at /127.0.0.1:58040 [Receiving block BP-2032095287-127.0.0.1-1462786332089:blk_1073741827_1003]] ERROR datanode.DataNode (DataXceiver.java:run(316)) - 127.0.0.1:58021:DataXceiver error processing WRITE_BLOCK operation src: /127.0.0.1:58040 dst: /127.0.0.1:58021 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:746) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:171) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:105) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:289) at java.lang.Thread.run(Thread.java:745) {code} This then causes an IOException, and the first target node is set into the response as a bad node (meaning, in fact, the datanode that failed during connection setup): {code} } catch (IOException e) { if (isClient) { BlockOpResponseProto.newBuilder() .setStatus(ERROR) // NB: Unconditionally using the xfer addr w/o hostname .setFirstBadLink(targets[0].getXferAddr()) .build() .writeDelimitedTo(replyOut); replyOut.flush(); } {code} That node's index is then recorded in the errorState, and the node is marked as bad. The code will not return false here: {code} if (!errorState.hasDatanodeError() && !shouldHandleExternalError()) { return false; } {code} It then continues to execute {{setupPipelineForAppendOrRecovery}} in {{processDatanodeOrExternalError}}, and finally prints the message "no more good datanodes being available to replace a bad datanode on the existing pipeline". So we should disable the property {{dfs.client.block.write.replace-datanode-on-failure.enable}}; there is no need to set {{dfs.client.block.write.replace-datanode-on-failure.policy}} here. The test will fail once the socket I/O timeout happens. In addition, the similar test {{TestFileAppend#testMultiAppend2}} has already disabled this. I will post a patch for this later; thanks for reviewing. > Intermittent org.apache.hadoop.hdfs.TestFileAppend failure in trunk > --- > > Key: HDFS-10333 > URL: https://issues.apache.org/jira/browse/HDFS-10333 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Yongjun Zhang > > Java8 (I used JAVA_HOME=/opt/toolchain/jdk1.8.0_25): > {code} > -- > T E S T S > --- > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; > support was removed in 8.0 > Running org.apache.hadoop.hdfs.TestFileAppend > Tests run: 12, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.75 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.TestFileAppend > testMultipleAppends(org.apache.hadoop.hdfs.TestFileAppend) Time elapsed: > 3.674 sec <<< ERROR! 
> java.io.IOException: Failed to replace a bad datanode on the existing > pipeline due to no more good datanodes being available to try. (Nodes: > current=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]], > > original=[DatanodeInfoWithStorage[127.0.0.1:43067,DS-cf80da41-3697-4afa-8f89-93693cd5035d,DISK], > > DatanodeInfoWithStorage[127.0.0.1:32946,DS-3b08422c-959e-42f0-a624-91b2524c4371,DISK]]). > The current failed datanode replacement policy is DEFAULT, and a client may > configure this via > 'dfs.client.block.write.replace-datanode-on-failure.policy' in its > configuration. > at > org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1166) > at > org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1232) > at > org.apache.hadoop.hdfs.DataStreamer.handleDatano
[jira] [Commented] (HDFS-9803) Proactively refresh ShortCircuitCache entries to avoid latency spikes
[ https://issues.apache.org/jira/browse/HDFS-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276176#comment-15276176 ] Jery ma commented on HDFS-9803: --- Let's say, for example, that the application does not read all the way through to the tenth block within the block access token expiration. At that point, the client would recover internally by issuing another RPC to the NameNode to fetch a fresh block access token with a new expiration. Then, the read would continue as normal. In fact, the regionserver doesn't fetch a fresh token. > Proactively refresh ShortCircuitCache entries to avoid latency spikes > - > > Key: HDFS-9803 > URL: https://issues.apache.org/jira/browse/HDFS-9803 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Nick Dimiduk > > My region server logs are flooding with messages like > "SecretManager$InvalidToken: access control error while attempting to set up > short-circuit access to ... is expired". These logs > correspond with responseTooSlow WARNings from the region server. > {noformat} > 2016-01-19 22:10:14,432 INFO > [B.defaultRpcServer.handler=4,queue=1,port=16020] > shortcircuit.ShortCircuitCache: ShortCircuitCache(0x71bdc547): could not load > 1074037633_BP-1145309065-XXX-1448053136416 due to InvalidToken exception. > org.apache.hadoop.security.token.SecretManager$InvalidToken: access control > error while attempting to set up short-circuit access to token > with block_token_identifier (expiryDate=1453194430724, keyId=1508822027, > userId=hbase, blockPoolId=BP-1145309065-XXX-1448053136416, > blockId=1074037633, access modes=[READ]) is expired. > at > org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:591) > at > org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490) > at > org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782) > at > org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716) > at > org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422) > at > org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333) > at > org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618) > at > org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896) > at java.io.DataInputStream.read(DataInputStream.java:149) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:678) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1372) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437) > ... > {noformat} > A potential solution could be to have a background thread that makes a best > effort to proactively refresh tokens in the cache before they expire, so as > to minimize the latency impact on the critical path. > Thanks to [~cnauroth] for providing an explanation and suggesting a solution > over on the [user > list|http://mail-archives.apache.org/mod_mbox/hadoop-user/201601.mbox/%3CCANZa%3DGt%3Dhvuf3fyOJqf-jdpBPL_xDknKBcp7LmaC-YUm0jDUVg%40mail.gmail.com%3E]. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
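A best-effort background refresh as proposed could look roughly like the sketch below. This is entirely hypothetical: {{snapshotOfEntries()}}, {{getTokenExpiryMs()}} and {{refreshBlockToken()}} do not exist in today's {{ShortCircuitCache}}, and a real patch would have to hook into the cache's own executor and locking:

{code}
// Hypothetical background refresher: periodically scan cached replicas
// and renew any block access token close to expiry, so expiration is
// never hit on the read path. All three replica methods are assumed.
ScheduledExecutorService refresher =
    Executors.newSingleThreadScheduledExecutor();
refresher.scheduleWithFixedDelay(() -> {
  long now = Time.now();
  for (ShortCircuitReplica replica : cache.snapshotOfEntries()) {
    if (replica.getTokenExpiryMs() - now < REFRESH_WINDOW_MS) {
      try {
        replica.refreshBlockToken();  // assumed: re-fetch via a NameNode RPC
      } catch (IOException e) {
        LOG.warn("Best-effort token refresh failed for " + replica, e);
      }
    }
  }
}, 60, 60, TimeUnit.SECONDS);
{code}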
[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker
[ https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-8449: Attachment: HDFS-8449-v11.patch v11 fixes checkstyle problems > Add tasks count metrics to datanode for ECWorker > > > Key: HDFS-8449 > URL: https://issues.apache.org/jira/browse/HDFS-8449 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Li Bo >Assignee: Li Bo > Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, > HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch, > HDFS-8449-005.patch, HDFS-8449-006.patch, HDFS-8449-007.patch, > HDFS-8449-008.patch, HDFS-8449-009.patch, HDFS-8449-010.patch, > HDFS-8449-v10.patch, HDFS-8449-v11.patch > > > This sub-task tries to record the EC recovery tasks that a datanode has done, > including total tasks, failed tasks and successful tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
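For readers following along, the counters in question are ordinary metrics2 counters on the DataNode. A minimal sketch of the shape of such a change (field names illustrative, not necessarily those used in the patch):

{code}
// Sketch of per-DN ECWorker task counters using the metrics2 @Metric
// annotation; the field names are illustrative, not the patch's own.
@Metric("Total EC reconstruction tasks")
MutableCounterLong ecReconstructionTasks;
@Metric("Failed EC reconstruction tasks")
MutableCounterLong ecFailedReconstructionTasks;

void incrECReconstructionTasks() {
  ecReconstructionTasks.incr();
}

void incrECFailedReconstructionTasks() {
  ecFailedReconstructionTasks.incr();
}
{code}

The ECWorker would bump these counters as each reconstruction task completes or fails, and the successful count falls out as the difference.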
[jira] [Commented] (HDFS-8449) Add tasks count metrics to datanode for ECWorker
[ https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276084#comment-15276084 ] Hadoop QA commented on HDFS-8449: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 56s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 26s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 17 new + 97 unchanged - 3 fixed = 114 total (was 100) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 53s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_91. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 19s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 152m 40s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.TestAsyncDFSRename | | | hadoop.hdfs.security.TestDelegationTokenForProxyUser | | | hadoop.hdfs.server.namenode.TestCacheDirectives | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:cf2ee45 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12802921/HDFS-8449-v10.patch | | JIRA Issue | HDFS-8449 | | Optional Tests | as
[jira] [Commented] (HDFS-7212) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time
[ https://issues.apache.org/jira/browse/HDFS-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276037#comment-15276037 ] Rajesh Chandramohan commented on HDFS-7212: --- Hi, we have a similar issue where DN sockets get stuck (Hadoop version 2.4.0). === # lsof /var/run/hadoop-hdfs/dn | wc -l 6876 jstack shows == Thread 5295: (state = BLOCKED) - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise) - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Compiled frame) - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Compiled frame) - org.apache.hadoop.net.unix.DomainSocketWatcher.add(org.apache.hadoop.net.unix.DomainSocket, org.apache.hadoop.net.unix.DomainSocketWatcher$Handler) @bci=99, line=316 (Interpreted frame) - org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(java.lang.String, org.apache.hadoop.net.unix.DomainSocket) @bci=169, line=283 (Interpreted frame) - org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(java.lang.String) @bci=212, line=413 (Interpreted frame) - org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(java.io.DataInputStream) @bci=13, line=172 (Interpreted frame) - org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(org.apache.hadoop.hdfs.protocol.datatransfer.Op) @bci=149, line=92 (Compiled frame) - org.apache.hadoop.hdfs.server.datanode.DataXceiver.run() @bci=510, line=232 (Compiled frame) - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame) Almost 6k threads are in this state; after a while the DN hangs. Do we have any details regarding this issue? Is any patch available for it? It looks like a similar kind of issue. > FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a > very long time > - > > Key: HDFS-7212 > URL: https://issues.apache.org/jira/browse/HDFS-7212 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.4.0 > Environment: PROD >Reporter: Istvan Szukacs > > There are 3000 - 8000 threads in each datanode JVM, blocking the entire VM > and rendering the service unusable, missing heartbeats and stopping data > access. 
The threads look like this: > {code} > 3415 (state = BLOCKED) > - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may > be imprecise) > - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, > line=186 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() > @bci=1, line=834 (Interpreted frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, > int) @bci=67, line=867 (Interpreted frame) > - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, > line=1197 (Interpreted frame) > - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, > line=214 (Compiled frame) > - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=290 (Compiled > frame) > - > org.apache.hadoop.net.unix.DomainSocketWatcher.add(org.apache.hadoop.net.unix.DomainSocket, > org.apache.hadoop.net.unix.DomainSocketWatcher$Handler) @bci=4, line=286 > (Interpreted frame) > - > org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(java.lang.String, > org.apache.hadoop.net.unix.DomainSocket) @bci=169, line=283 (Interpreted > frame) > - > org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(java.lang.String) > @bci=212, line=413 (Interpreted frame) > - > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(java.io.DataInputStream) > @bci=13, line=172 (Interpreted frame) > - > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(org.apache.hadoop.hdfs.protocol.datatransfer.Op) > @bci=149, line=92 (Compiled frame) > - org.apache.hadoop.hdfs.server.datanode.DataXceiver.run() @bci=510, line=232 > (Compiled frame) > - java.lang.Thread.run() @bci=11, line=744 (Interpreted frame) > {code} > Has anybody seen this before? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org