[jira] [Commented] (HDDS-965) Ozone: checkstyle improvements and code quality scripts
[ https://issues.apache.org/jira/browse/HDDS-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16738667#comment-16738667 ] Sean Busbey commented on HDDS-965: -- I'll get a DISCUSS thread going sometime in the next week or two as I finish reflecting on things. > Ozone: checkstyle improvements and code quality scripts > --- > > Key: HDDS-965 > URL: https://issues.apache.org/jira/browse/HDDS-965 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Experimental scripts to test github pr capabilities after the github url > move. The provided scripts are easier to use locally and provide more > strict/focused checks than the existing pre-commit scripts. But this is not a > replacement for the existing yetus build, as it adds additional (more strict) > checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-965) Ozone: checkstyle improvements and code quality scripts
[ https://issues.apache.org/jira/browse/HDDS-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737727#comment-16737727 ] Sean Busbey commented on HDDS-965: -- If we're one project, we should have a common methodology for things like tests so that folks don't need to learn and track different ones. If Ozone's future is an independent project then we should be aimed there from the start, e.g. with its own repo and treated as a podling if not outright in the incubator. I'll go back and re-read the original discussion and community promises, then start a DISCUSS thread since I think this needs more project visibility than it's going to get on some random build quality improvement JIRA (as I am painfully aware of how few other folks track those). > Ozone: checkstyle improvements and code quality scripts > --- > > Key: HDDS-965 > URL: https://issues.apache.org/jira/browse/HDDS-965 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Experimental scripts to test github pr capabilities after the github url > move. The provided scripts are easier to use locally and provide more > strict/focused checks than the existing pre-commit scripts. But this is not a > replacement for the existing yetus build, as it adds additional (more strict) > checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-965) Ozone: checkstyle improvements and code quality scripts
[ https://issues.apache.org/jira/browse/HDDS-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737661#comment-16737661 ] Sean Busbey commented on HDDS-965: -- Can you point me to a place to discuss this? It seems a waste to bifurcate our efforts this way. If Ozone is a part of the Hadoop project, we should keep our test efforts consolidated. > Ozone: checkstyle improvements and code quality scripts > --- > > Key: HDDS-965 > URL: https://issues.apache.org/jira/browse/HDDS-965 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Experimental scripts to test github pr capabilities after the github url > move. The provided scripts are easier to use locally and provide more > strict/focused checks than the existing pre-commit scripts. But this is not a > replacement for the existing yetus build, as it adds additional (more strict) > checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13825) HDFS Uses very outdated okhttp library
[ https://issues.apache.org/jira/browse/HDFS-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581491#comment-16581491 ] Sean Busbey commented on HDFS-13825: [~BenParker25] is this problem still present if you rely on the {{hadoop-client-api}} / {{hadoop-client-runtime}} jars? those are meant to be the downstream facing interface for HDFS. > HDFS Uses very outdated okhttp library > -- > > Key: HDFS-13825 > URL: https://issues.apache.org/jira/browse/HDFS-13825 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.3 >Reporter: Ben Parker >Priority: Minor > > HDFS Client uses okHttp library version 2.7.4, which is two years out of date. > [https://mvnrepository.com/artifact/com.squareup.okhttp/okhttp] > The updates for this library have been moved to a new package here: > [https://mvnrepository.com/artifact/com.squareup.okhttp3/okhttp] > > This causes dependency management problems for services that use HDFS. > For example, trying to use okHttp in code that runs on Amazon EMR gives you > Method not found errors due to the new version being kicked out in favour of > the one used by HDFS. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
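To make the suggestion above concrete, here is a minimal sketch of a downstream client built purely against the shaded {{hadoop-client-api}} artifact, with {{hadoop-client-runtime}} on the runtime classpath; the class name and path are invented for illustration, not taken from the ticket.
{code}
// Sketch of a downstream client compiled against hadoop-client-api only.
// hadoop-client-runtime supplies the (relocated) third-party classes at
// runtime, so the okhttp bundled by Hadoop cannot clash with the
// application's own okhttp version.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShadedClientCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf)) {
      // Only org.apache.hadoop.* types are referenced here; no third-party
      // classes from Hadoop's dependency tree leak into the application.
      System.out.println("exists: " + fs.exists(new Path("/tmp")));
    }
  }
}
{code}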
[jira] [Commented] (HDFS-13702) HTrace hooks taking 10-15% CPU in DFS client when disabled
[ https://issues.apache.org/jira/browse/HDFS-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525528#comment-16525528 ] Sean Busbey commented on HDFS-13702: {quote} Then, when it ran the hadoop-hdfs tests, they ran against the trunk snapshot build rather than the patched snapshot build, and failed to compile. Sean's going to file a YETUS JIRA about this and re-submit the patch build here. {quote} I wanna take a look at Hadoop's pom. ideally we shouldn't have the ASF snapshot repo as a source of artifacts at all. I'm not sure there's a way for yetus to proactively defend against it. (maybe use maven's {{--offline}} for build steps after the one where we say we're downloading the world?) > HTrace hooks taking 10-15% CPU in DFS client when disabled > -- > > Key: HDFS-13702 > URL: https://issues.apache.org/jira/browse/HDFS-13702 > Project: Hadoop HDFS > Issue Type: Bug > Components: performance >Affects Versions: 3.0.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Major > Attachments: hdfs-13702.patch, hdfs-13702.patch, hdfs-13702.patch > > > I am seeing DFSClient.newReaderTraceScope take ~15% CPU in a teravalidate > workload even when HTrace is disabled. This is because it stringifies several > integers. We should avoid all allocation and stringification when htrace is > disabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13702) HTrace hooks taking 10-15% CPU in DFS client when disabled
[ https://issues.apache.org/jira/browse/HDFS-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525525#comment-16525525 ] Sean Busbey commented on HDFS-13702: well I started another run but the Hadoop related precommit jobs don't have a debug flag so I'll need to add one. > HTrace hooks taking 10-15% CPU in DFS client when disabled > -- > > Key: HDFS-13702 > URL: https://issues.apache.org/jira/browse/HDFS-13702 > Project: Hadoop HDFS > Issue Type: Bug > Components: performance >Affects Versions: 3.0.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Major > Attachments: hdfs-13702.patch, hdfs-13702.patch, hdfs-13702.patch > > > I am seeing DFSClient.newReaderTraceScope take ~15% CPU in a teravalidate > workload even when HTrace is disabled. This is because it stringifies several > integers. We should avoid all allocation and stringification when htrace is > disabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13702) HTrace hooks taking 10-15% CPU in DFS client when disabled
[ https://issues.apache.org/jira/browse/HDFS-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525521#comment-16525521 ] Sean Busbey commented on HDFS-13702: Todd got me to the log
{code}
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] ------------------------------------------------------------------------
[ERROR] /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/StripedBlockReader.java:[128,30] error: method newBlockReader in class BlockReaderRemote cannot be applied to given types;
[INFO] 1 error
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 26.030 s
[INFO] Finished at: 2018-06-27T00:06:25+00:00
[INFO] Final Memory: 27M/570M
[INFO]
[WARNING] The requested profile "native" could not be activated because it does not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it does not exist.
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-hdfs: Compilation failure
[ERROR] /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/StripedBlockReader.java:[128,30] error: method newBlockReader in class BlockReaderRemote cannot be applied to given types;
{code}
we've been chatting and it definitely looks like we run against the wrong artifact. Todd has a plausible theory that maybe a concurrent run of the snapshot publisher happens to land in the asf snapshot repo between when we did test-compile and here. I'm going to rerun in debug. > HTrace hooks taking 10-15% CPU in DFS client when disabled > -- > > Key: HDFS-13702 > URL: https://issues.apache.org/jira/browse/HDFS-13702 > Project: Hadoop HDFS > Issue Type: Bug > Components: performance >Affects Versions: 3.0.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Major > Attachments: hdfs-13702.patch, hdfs-13702.patch, hdfs-13702.patch > > > I am seeing DFSClient.newReaderTraceScope take ~15% CPU in a teravalidate > workload even when HTrace is disabled. This is because it stringifies several > integers. We should avoid all allocation and stringification when htrace is > disabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13702) HTrace hooks taking 10-15% CPU in DFS client when disabled
[ https://issues.apache.org/jira/browse/HDFS-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524385#comment-16524385 ] Sean Busbey commented on HDFS-13702: Why do you think it didn't apply the latest patch? the reported URL is the latest one AFAICT. > HTrace hooks taking 10-15% CPU in DFS client when disabled > -- > > Key: HDFS-13702 > URL: https://issues.apache.org/jira/browse/HDFS-13702 > Project: Hadoop HDFS > Issue Type: Bug > Components: performance >Affects Versions: 3.0.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Major > Attachments: hdfs-13702.patch, hdfs-13702.patch, hdfs-13702.patch > > > I am seeing DFSClient.newReaderTraceScope take ~15% CPU in a teravalidate > workload even when HTrace is disabled. This is because it stringifies several > integers. We should avoid all allocation and stringification when htrace is > disabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13702) HTrace hooks taking 10-15% CPU in DFS client when disabled
[ https://issues.apache.org/jira/browse/HDFS-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524270#comment-16524270 ] Sean Busbey commented on HDFS-13702: {quote} I think we should commit this patch, +1, and then we can file another to review how to move forward with tracing in light of recent developments in htrace project; i.e. purge all other htrace references, look into alternatives, etc. {quote} please be sure to link the follow-on jira to this one. probably should get a DISCUSS thread on common-dev@ > HTrace hooks taking 10-15% CPU in DFS client when disabled > -- > > Key: HDFS-13702 > URL: https://issues.apache.org/jira/browse/HDFS-13702 > Project: Hadoop HDFS > Issue Type: Bug > Components: performance >Affects Versions: 3.0.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Major > Attachments: hdfs-13702.patch, hdfs-13702.patch, hdfs-13702.patch > > > I am seeing DFSClient.newReaderTraceScope take ~15% CPU in a teravalidate > workload even when HTrace is disabled. This is because it stringifies several > integers. We should avoid all allocation and stringification when htrace is > disabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10183) Prevent race condition during class initialization
[ https://issues.apache.org/jira/browse/HDFS-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446591#comment-16446591 ] Sean Busbey commented on HDFS-10183: I'm still +1. I'd prefer it go in any branch that's still having releases and it applies to. I'm willing to help with backports if it's a limiting factor. > Prevent race condition during class initialization > -- > > Key: HDFS-10183 > URL: https://issues.apache.org/jira/browse/HDFS-10183 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Affects Versions: 2.9.0 >Reporter: Pavel Avgustinov >Assignee: Pavel Avgustinov >Priority: Minor > Attachments: HADOOP-12944.1.patch, HDFS-10183.2.patch > > > In HADOOP-11969, [~busbey] tracked down a non-deterministic > {{NullPointerException}} to an oddity in the Java memory model: When multiple > threads trigger the loading of a class at the same time, one of them wins and > creates the {{java.lang.Class}} instance; the others block during this > initialization, but once it is complete they may obtain a reference to the > {{Class}} which has non-{{final}} fields still containing their default (i.e. > {{null}}) values. This leads to runtime failures that are hard to debug or > diagnose. > HADOOP-11969 observed that {{ThreadLocal}} fields, by their very nature, are > very likely to be accessed from multiple threads, and thus the problem is > particularly severe there. Consequently, the patch removed all occurrences of > the issue in the code base. > Unfortunately, since then HDFS-7964 has [reverted one of the fixes during a > refactoring|https://github.com/apache/hadoop/commit/2151716832ad14932dd65b1a4e47e64d8d6cd767#diff-0c2e9f7f9e685f38d1a11373b627cfa6R151], > and introduced a [new instance of the > problem|https://github.com/apache/hadoop/commit/2151716832ad14932dd65b1a4e47e64d8d6cd767#diff-6334d0df7d9aefbccd12b21bb7603169R43]. > The attached patch addresses the issue by adding the missing {{final}} > modifier in these two cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
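As context for the patch being +1'd here, a hedged illustration of the initialization race the issue describes; the class and field names are invented, not the actual Hadoop source.
{code}
// A static field assigned during class initialization. Per the issue
// description, a thread that blocks on another thread's in-progress class
// initialization may later obtain the Class with non-final static fields
// still holding their default (null) values.
public class Holder {
  // Racy variant (the shape HDFS-7964 accidentally reintroduced):
  //   static ThreadLocal<StringBuilder> BUF =
  //       ThreadLocal.withInitial(StringBuilder::new);

  // Fixed variant (what the attached patch restores): marking the field
  // final means any thread that obtains the initialized Class sees the
  // fully constructed value.
  static final ThreadLocal<StringBuilder> BUF =
      ThreadLocal.withInitial(StringBuilder::new);
}
{code}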
[jira] [Commented] (HDFS-12711) deadly hdfs test
[ https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256317#comment-16256317 ] Sean Busbey commented on HDFS-12711: Personally, I think we can rely on committers to examine the output and disregard license violation notifications on dumpfiles. However, if we want to remove the false positive we could update [the current list of RAT plugin exclusions|https://github.com/apache/hadoop/blob/trunk/pom.xml#L377]. It'd be something like:
{code}
...
<plugin>
  <groupId>org.apache.rat</groupId>
  <artifactId>apache-rat-plugin</artifactId>
  <configuration>
    <excludes>
      <exclude>.gitattributes</exclude>
      <exclude>.gitignore</exclude>
      <exclude>.git/**</exclude>
      <exclude>.idea/**</exclude>
      <exclude>**/build/**</exclude>
      <exclude>**/patchprocess/**</exclude>
      <exclude>**/*.js</exclude>
      <exclude>**/hs_err_pid*.log</exclude>
    </excludes>
  </configuration>
</plugin>
...
{code}
(as an aside, excluding all javascript files seems unwisely broad, especially given the substantial size of the YARN UI module at this point.) > deadly hdfs test > > > Key: HDFS-12711 > URL: https://issues.apache.org/jira/browse/HDFS-12711 > Project: Hadoop HDFS > Issue Type: Test >Affects Versions: 2.9.0, 2.8.2 >Reporter: Allen Wittenauer >Priority: Critical > Attachments: HDFS-12711.branch-2.00.patch, fakepatch.branch-2.txt > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12808) Add LOG.isDebugEnabled() guard for LOG.debug("...")
[ https://issues.apache.org/jira/browse/HDFS-12808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250354#comment-16250354 ] Sean Busbey commented on HDFS-12808: we've been slowly moving module by module over to slf4j. Agreed that time is better spent working towards that goal for any modules that contain unguarded string concats. > Add LOG.isDebugEnabled() guard for LOG.debug("...") > --- > > Key: HDFS-12808 > URL: https://issues.apache.org/jira/browse/HDFS-12808 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Mehran Hassani >Assignee: Bharat Viswanadham >Priority: Minor > > I am conducting research on log-related bugs. I tried to make a tool to fix > repetitive yet simple patterns of bugs that are related to logs. In this > file, there is a debug-level logging statement containing multiple string > concatenations without an if statement before it: > hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestCachingStrategy.java, > LOG.debug("got fadvise(offset=" + offset + ", len=" + len +",flags=" + flags > + ")");, 82 > Would you be interested in adding the if to the logging statement? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
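To make the two styles being weighed here concrete, a short sketch with invented names (not the TestCachingStrategy code itself), contrasting the guarded pattern the report asks for with the slf4j parameterized style the module migration enables.
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingStyles {
  private static final Logger LOG = LoggerFactory.getLogger(LoggingStyles.class);

  void example(long offset, int len, int flags) {
    // commons-logging style: guard the call so the string concatenation
    // cost is only paid when debug logging is actually enabled.
    if (LOG.isDebugEnabled()) {
      LOG.debug("got fadvise(offset=" + offset + ", len=" + len
          + ", flags=" + flags + ")");
    }

    // slf4j style: the {} placeholders defer message construction until
    // the logger has checked the level, so no explicit guard is needed.
    LOG.debug("got fadvise(offset={}, len={}, flags={})", offset, len, flags);
  }
}
{code}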
[jira] [Commented] (HDFS-12711) deadly hdfs test
[ https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225562#comment-16225562 ] Sean Busbey commented on HDFS-12711: interesting. what's the spread on surefire versions across the Hadoop branches? If we are triggering things in HBase, those builds all use a surefire version that includes that fix, with the exception of hbase branch-1.1. > deadly hdfs test > > > Key: HDFS-12711 > URL: https://issues.apache.org/jira/browse/HDFS-12711 > Project: Hadoop HDFS > Issue Type: Test >Affects Versions: 2.9.0, 2.8.2 >Reporter: Allen Wittenauer >Priority: Critical > Attachments: HDFS-12711.branch-2.00.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-12715) I didn't find the "-find" command on hadoop2.6
[ https://issues.apache.org/jira/browse/HDFS-12715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey resolved HDFS-12715. Resolution: Invalid Fix Version/s: (was: 2.6.0) Please use the user@hadoop mailing list for these kinds of questions: https://lists.apache.org/list.html?u...@hadoop.apache.org > I didn't find the "-find" command on hadoop2.6 > -- > > Key: HDFS-12715 > URL: https://issues.apache.org/jira/browse/HDFS-12715 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Affects Versions: 2.6.0 >Reporter: Djc >Priority: Critical > > When I looked for files on HDFS, I found no "find" command. I didn't find the > "find" command by using "Hadoop FS -help". -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12711) deadly hdfs test
[ https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-12711: --- Affects Version/s: 2.8.2 2.9.0 > deadly hdfs test > > > Key: HDFS-12711 > URL: https://issues.apache.org/jira/browse/HDFS-12711 > Project: Hadoop HDFS > Issue Type: Test >Affects Versions: 2.9.0, 2.8.2 >Reporter: Allen Wittenauer > Attachments: HDFS-12711.branch-2.00.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12711) deadly hdfs test
[ https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-12711: --- Priority: Critical (was: Major) > deadly hdfs test > > > Key: HDFS-12711 > URL: https://issues.apache.org/jira/browse/HDFS-12711 > Project: Hadoop HDFS > Issue Type: Test >Affects Versions: 2.9.0, 2.8.2 >Reporter: Allen Wittenauer >Priority: Critical > Attachments: HDFS-12711.branch-2.00.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12599) Remove Mockito dependency from DataNodeTestUtils
[ https://issues.apache.org/jira/browse/HDFS-12599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201068#comment-16201068 ] Sean Busbey commented on HDFS-12599: cherry-picked, ran through the altered tests, then pushed to branch-3.0. > Remove Mockito dependency from DataNodeTestUtils > > > Key: HDFS-12599 > URL: https://issues.apache.org/jira/browse/HDFS-12599 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0-beta1 >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Minor > Fix For: 3.0.0, 3.1.0 > > Attachments: HDFS-12599.v1.patch, HDFS-12599.v1.patch, > HDFS-12599.v1.patch > > > HDFS-11164 introduced {{DataNodeTestUtils.mockDatanodeBlkPinning}} which > brought dependency on mockito back into DataNodeTestUtils > Downstream, this resulted in: > {code} > java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer > at > org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2668) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2564) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2607) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1667) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:874) > at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:769) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:661) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:953) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12599) Remove Mockito dependency from DataNodeTestUtils
[ https://issues.apache.org/jira/browse/HDFS-12599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-12599: --- Fix Version/s: 3.0.0 > Remove Mockito dependency from DataNodeTestUtils > > > Key: HDFS-12599 > URL: https://issues.apache.org/jira/browse/HDFS-12599 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0-beta1 >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Minor > Fix For: 3.0.0, 3.1.0 > > Attachments: HDFS-12599.v1.patch, HDFS-12599.v1.patch, > HDFS-12599.v1.patch > > > HDFS-11164 introduced {{DataNodeTestUtils.mockDatanodeBlkPinning}} which > brought dependency on mockito back into DataNodeTestUtils > Downstream, this resulted in: > {code} > java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer > at > org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2668) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2564) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2607) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1667) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:874) > at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:769) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:661) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:953) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12599) Remove Mockito dependency from DataNodeTestUtils
[ https://issues.apache.org/jira/browse/HDFS-12599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200815#comment-16200815 ] Sean Busbey commented on HDFS-12599: any objections to including this in branch-3.0? > Remove Mockito dependency from DataNodeTestUtils > > > Key: HDFS-12599 > URL: https://issues.apache.org/jira/browse/HDFS-12599 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0-beta1 >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Minor > Fix For: 3.1.0 > > Attachments: HDFS-12599.v1.patch, HDFS-12599.v1.patch, > HDFS-12599.v1.patch > > > HDFS-11164 introduced {{DataNodeTestUtils.mockDatanodeBlkPinning}} which > brought dependency on mockito back into DataNodeTestUtils > Downstream, this resulted in: > {code} > java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer > at > org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2668) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2564) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2607) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1667) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:874) > at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:769) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:661) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:953) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12384) Fixing compilation issue with BanDuplicateClasses
[ https://issues.apache.org/jira/browse/HDFS-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156045#comment-16156045 ] Sean Busbey commented on HDFS-12384: that's flagging an issue that exists in the branch before the patch: {code} -1 compile 1m 53s root in HDFS-10467 failed. {code} It properly shows that with your patch it passes: {code} +1 compile 14m 47s the patch passed {code} > Fixing compilation issue with BanDuplicateClasses > - > > Key: HDFS-12384 > URL: https://issues.apache.org/jira/browse/HDFS-12384 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri > Fix For: HDFS-10467 > > Attachments: HDFS-12384-HDFS-10467-000.patch, > HDFS-12384-HDFS-10467-001.patch, HDFS-12384-HDFS-10467-002.patch, > HDFS-12384-HDFS-10467-003.patch, HDFS-12384-HDFS-10467-004.patch > > > {{hadoop-client-modules}} is failing because of dependences added by > {{CuratorManager}}: > {code} > [INFO] Adding ignore: * > [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses > failed with message: > Duplicate classes found: > Found in: > > org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile > org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile > Duplicate classes: > > org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class > > org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12384) Fixing compilation issue with BanDuplicateClasses
[ https://issues.apache.org/jira/browse/HDFS-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150081#comment-16150081 ] Sean Busbey commented on HDFS-12384: Ah. The correct solution is to update the exclusions that determine which shaded artifact the new classes go into; they should only go into one. If a downstream client needs the curator classes in order to interact with HDFS, then they should only be in the hadoop-client-runtime (which would mean updating the pom for hadoop-client-minicluster to exclude them). You can see an example of excluding all of the curator-client from hadoop-client-minicluster: https://github.com/apache/hadoop/blob/ce797a170669524224cfeaaf70647047e7626816/hadoop-client-modules/hadoop-client-minicluster/pom.xml#L137 If you need to just exclude some specific classes then take a look at the set of filters: https://github.com/apache/hadoop/blob/ce797a170669524224cfeaaf70647047e7626816/hadoop-client-modules/hadoop-client-minicluster/pom.xml#L603 > Fixing compilation issue with BanDuplicateClasses > - > > Key: HDFS-12384 > URL: https://issues.apache.org/jira/browse/HDFS-12384 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri > Fix For: HDFS-10467 > > Attachments: HDFS-12384-HDFS-10467-000.patch > > > {{hadoop-client-modules}} is failing because of dependences added by > {{CuratorManager}}: > {code} > [INFO] Adding ignore: * > [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses > failed with message: > Duplicate classes found: > Found in: > > org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile > org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile > Duplicate classes: > > org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class > > org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12384) Fixing compilation issue with BanDuplicateClasses
[ https://issues.apache.org/jira/browse/HDFS-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149863#comment-16149863 ] Sean Busbey commented on HDFS-12384: Sorry, I'm on mobile so reading patch files is difficult. Is this proposing that we ignore all classes in the shaded minicluster module? > Fixing compilation issue with BanDuplicateClasses > - > > Key: HDFS-12384 > URL: https://issues.apache.org/jira/browse/HDFS-12384 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri > Fix For: HDFS-10467 > > Attachments: HDFS-12384-HDFS-10467-000.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12384) Fixing compilation issue with BanDuplicateClasses
[ https://issues.apache.org/jira/browse/HDFS-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149853#comment-16149853 ] Sean Busbey commented on HDFS-12384: Gimme time to gain context here. > Fixing compilation issue with BanDuplicateClasses > - > > Key: HDFS-12384 > URL: https://issues.apache.org/jira/browse/HDFS-12384 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri > Fix For: HDFS-10467 > > Attachments: HDFS-12384-HDFS-10467-000.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12263) Revise StreamCapabilities doc to describe the API usage and the requirements for customized OutputStream implementation
[ https://issues.apache.org/jira/browse/HDFS-12263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-12263: --- Target Version/s: 3.0.0-beta1 (was: 3.0.0-alpha4) > Revise StreamCapabilities doc to describe the API usage and the requirements > for customized OutputStream implementation > > > Key: HDFS-12263 > URL: https://issues.apache.org/jira/browse/HDFS-12263 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha4 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > > [~busbey] raised the concern that we should call out the expected way to call > {{StreamCapabilities}} from the client side. This doc should also > describe the rules for any {{FSOutputStream}} implementation to follow. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12263) Revise StreamCapabilities doc to describe the API usage and the requirements for customized OutputStream implementation
[ https://issues.apache.org/jira/browse/HDFS-12263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114936#comment-16114936 ] Sean Busbey commented on HDFS-12263: I think that's got it covered. Thanks [~eddyxu]! > Revise StreamCapabilities doc to describe the API usage and the requirements > for customized OutputStream implementation > > > Key: HDFS-12263 > URL: https://issues.apache.org/jira/browse/HDFS-12263 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha4 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > > [~busbey] raised the concern that we should call out the expected way to call > {{StreamCapabilities}} from the client side. This doc should also > describe the rules for any {{FSOutputStream}} implementation to follow. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12251) Add document for StreamCapabilities
[ https://issues.apache.org/jira/browse/HDFS-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114832#comment-16114832 ] Sean Busbey commented on HDFS-12251: Another thing missing from the docs as they are in v2: from Steve L on HDFS-11644: {quote} I don't want an enum though, I'd like a string. Why? Lets us and implementation classes add new methods without renegotiating the source tree. For example, on S3a, I could add the new attributes "s3a:delayed-commit", blobstore:put-on-close (the latter being something which other blobstores could also do), etc. Some convention that for store-specific options, we use a prefix, so as to stop incompatibilities. {quote} The docs should call out that the options from store implementations should use a prefix. Preferably also state a suggested place implementors should doc the options they provide (based on whatever we do for documenting the stores in HADOOP-14402). > Add document for StreamCapabilities > --- > > Key: HDFS-12251 > URL: https://issues.apache.org/jira/browse/HDFS-12251 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha4 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: HDFS-12251.00.patch, HDFS-12251.01.patch, > HDFS-12251.02.patch > > > Update filesystem docs to describe the purpose and usage of > {{StreamCapabilities}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12251) Add document for StreamCapabilities
[ https://issues.apache.org/jira/browse/HDFS-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114828#comment-16114828 ] Sean Busbey commented on HDFS-12251: {quote} bq. StreamCapabilities.StreamCapability isn't public I feel that it should be public. Will file a new JIRA for it. {quote} In the future, please at least wait until your proposed JIRA actually lands. I'm not sure I agree with this assertion that it should be public. {quote} bq. Do all outputstreams returned from FileSystem implement StreamCapabilities? I think they are going to support it, as WIP in HADOOP-14402. {quote} Even if/when HADOOP-14402 completes, there's nothing that requires an outputstream to implement the interface, right? So third-party implementations might not do so. We should call this out in the docs. > Add document for StreamCapabilities > --- > > Key: HDFS-12251 > URL: https://issues.apache.org/jira/browse/HDFS-12251 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha4 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: HDFS-12251.00.patch, HDFS-12251.01.patch, > HDFS-12251.02.patch > > > Update filesystem docs to describe the purpose and usage of > {{StreamCapabilities}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12251) Add document for StreamCapabilities
[ https://issues.apache.org/jira/browse/HDFS-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113483#comment-16113483 ] Sean Busbey commented on HDFS-12251: Questions I have after reading the current draft: Do all outputstreams returned from FileSystem implement StreamCapabilities? (I think no) Presuming no, what should I assume about an outputstream that gets returned to me that doesn't implement StreamCapabilities? (I think up to the application, but dataloss-sensitive applications need to presume no operations actually work) > Add document for StreamCapabilities > --- > > Key: HDFS-12251 > URL: https://issues.apache.org/jira/browse/HDFS-12251 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha4 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-12251.00.patch, HDFS-12251.01.patch, > HDFS-12251.02.patch > > > Update filesystem docs to describe the purpose and usage of > {{StreamCapabilities}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11644) Support for querying outputstream capabilities
[ https://issues.apache.org/jira/browse/HDFS-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113479#comment-16113479 ] Sean Busbey commented on HDFS-11644: {quote} In the meantime, I have a question for HBase. As StreamCapabilities is bound to an OutputStream, HBase needs to first open a file for write (i.e., get the output stream) before it can query the capabilities. Would this satisfy the needs on the HBase side? {quote} Yeah that's fine I think. I'll come complain if implementing use of it makes me change my mind. ;) > Support for querying outputstream capabilities > -- > > Key: HDFS-11644 > URL: https://issues.apache.org/jira/browse/HDFS-11644 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Manoj Govindassamy > Labels: hdfs-ec-3.0-must-do > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: HDFS-11644.01.patch, HDFS-11644.02.patch, > HDFS-11644.03.patch, HDFS-11644-branch-2.01.patch > > > FSDataOutputStream#hsync checks if a stream implements Syncable, and if so, > calls hsync. Otherwise, it just calls flush. This is used, for instance, by > YARN's FileSystemTimelineWriter. > DFSStripedOutputStream extends DFSOutputStream, which implements Syncable. > However, DFSStripedOS throws a runtime exception when the Syncable methods > are called. > We should refactor the inheritance structure so DFSStripedOS does not > implement Syncable. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12251) Add document for StreamCapabilities
[ https://issues.apache.org/jira/browse/HDFS-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113475#comment-16113475 ] Sean Busbey commented on HDFS-12251: Am I correctly deducing from these doc changes that there isn't an ability to query about {{append}}? Should there be? > Add document for StreamCapabilities > --- > > Key: HDFS-12251 > URL: https://issues.apache.org/jira/browse/HDFS-12251 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha4 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-12251.00.patch, HDFS-12251.01.patch, > HDFS-12251.02.patch > > > Update filesystem docs to describe the purpose and usage of > {{StreamCapabilities}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12251) Add document for StreamCapabilities
[ https://issues.apache.org/jira/browse/HDFS-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113472#comment-16113472 ] Sean Busbey commented on HDFS-12251:
{code}
1233  * `StreamCapabilties.HFLUSH` ("*hflush*"): the capability to flush out the data
1234  in client's buffer.
1235  * `StreamCapabilities.HSYNC` ("*hsync*"): capability to flush out the data in
1236  client's buffer and the disk device.
{code}
StreamCapabilities.StreamCapability isn't public, we shouldn't refer to it in downstream facing documentation. Just list the strings. > Add document for StreamCapabilities > --- > > Key: HDFS-12251 > URL: https://issues.apache.org/jira/browse/HDFS-12251 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha4 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-12251.00.patch, HDFS-12251.01.patch, > HDFS-12251.02.patch > > > Update filesystem docs to describe the purpose and usage of > {{StreamCapabilities}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11644) Support for querying outputstream capabilities
[ https://issues.apache.org/jira/browse/HDFS-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110224#comment-16110224 ] Sean Busbey commented on HDFS-11644: We need docs on using this feature. I'd like to file a blocker on HBase 2.0 to make use of it where it is present, specifically so that we can avoid deploying the HBase WAL on filesystems that don't support hflush/hsync (such as EC). The lack of ability to reliably flush/sync doesn't appear to be called out in the Erasure Coding docs as a concern for those who might deploy it and I'd rather HBase bring the risk to operators' attention by failing fast when it finds the configuration. But I can't find any guidance on using this feature (or that it exists) in the 3.0.0-alpha4 release notes nor in the filesystem docs. I also can't tell from the javadocs on StreamCapabilities what I'm even supposed to query for to check on the fundamentals that this jira is addressing (hflush / hsync). > Support for querying outputstream capabilities > -- > > Key: HDFS-11644 > URL: https://issues.apache.org/jira/browse/HDFS-11644 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Manoj Govindassamy > Labels: hdfs-ec-3.0-must-do > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: HDFS-11644.01.patch, HDFS-11644.02.patch, > HDFS-11644.03.patch, HDFS-11644-branch-2.01.patch > > > FSDataOutputStream#hsync checks if a stream implements Syncable, and if so, > calls hsync. Otherwise, it just calls flush. This is used, for instance, by > YARN's FileSystemTimelineWriter. > DFSStripedOutputStream extends DFSOutputStream, which implements Syncable. > However, DFSStripedOS throws a runtime exception when the Syncable methods > are called. > We should refactor the inheritance structure so DFSStripedOS does not > implement Syncable. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
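The fail-fast check being proposed for HBase would look roughly like the sketch below. It assumes the {{StreamCapabilities}} interface this issue added and the "hflush"/"hsync" capability strings discussed in HDFS-12251; the class and method names are invented for illustration, not HBase's actual code.
{code}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.StreamCapabilities;

public class WalPlacementCheck {
  // Refuse to place a write-ahead log where the output stream cannot
  // promise durable flush/sync semantics (e.g. an erasure-coded directory).
  static void ensureDurable(FileSystem fs, Path walPath) throws IOException {
    try (FSDataOutputStream out = fs.create(walPath)) {
      // Streams are not required to implement StreamCapabilities, so a
      // dataloss-sensitive caller treats its absence as "not supported".
      boolean durable = out instanceof StreamCapabilities
          && ((StreamCapabilities) out).hasCapability("hflush")
          && ((StreamCapabilities) out).hasCapability("hsync");
      if (!durable) {
        throw new IOException("Filesystem hosting " + walPath
            + " does not guarantee hflush/hsync; refusing to place a WAL here.");
      }
    }
  }
}
{code}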
[jira] [Commented] (HDFS-11538) Move ConfiguredFailoverProxyProvider into hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949925#comment-15949925 ] Sean Busbey commented on HDFS-11538: Please do the test moves now as well. If we wait we're likely to have a substantial amount of time where no tests run against changes (based on the prior hadoop-hdfs-client changes) > Move ConfiguredFailoverProxyProvider into hadoop-hdfs-client > > > Key: HDFS-11538 > URL: https://issues.apache.org/jira/browse/HDFS-11538 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Huafeng Wang >Priority: Blocker > Attachments: HDFS-11538.001.patch > > > Follow-up for HDFS-11431. We should move this missing class over rather than > pulling in the whole hadoop-hdfs dependency. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11588) Output Avro format in the offline editlog viewer
[ https://issues.apache.org/jira/browse/HDFS-11588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-11588: --- Component/s: tools > Output Avro format in the offline editlog viewer > > > Key: HDFS-11588 > URL: https://issues.apache.org/jira/browse/HDFS-11588 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: Haohui Mai >Assignee: Haohui Mai > > We found that it is handy to import the edit logs into query engines (e.g., > Hive / Presto) to understand the usage of the cluster. Some examples include: > * The size of the data and the number of files that are written into a > directory > * The distribution of the operations, for different directories. > * The number of files that are created by a user. > The answers to the above questions give insight into the usage of the > cluster and have significant value for capacity planning. > Importing the edit log into query engines simplifies the task of answering > these questions, and they can be answered efficiently. > While the Offline Editlog Viewer (OEV) supports outputting editlogs in XML > format, we found that it is time-consuming to transform the XML format to > formats that query engines recognize: generating the editlogs in XML format > and transforming them into formats that the query engine understands takes a > significant amount of time. In our environment it takes minutes to turn a > 100MB editlog file into a corresponding Parquet file. > This jira proposes to extend the OEV to output Avro files to make this > process efficient. As an internal tool, the Avro output format has certain > pre-defined schemas but, like the XML output format, it does not have the > constraint of maintaining backward compatibility of the output. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11588) Output Avro format in the offline editlog viewer
[ https://issues.apache.org/jira/browse/HDFS-11588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-11588: --- Issue Type: New Feature (was: Bug) > Output Avro format in the offline editlog viewer > > > Key: HDFS-11588 > URL: https://issues.apache.org/jira/browse/HDFS-11588 > Project: Hadoop HDFS > Issue Type: New Feature > Components: tools >Reporter: Haohui Mai >Assignee: Haohui Mai > > We found that it is handy to import the edit logs into query engines (e.g., > Hive / Presto) to understand the usage of the cluster. Some examples include: > * The size of the data and the number of files that are written into a > directory > * The distribution of the operations, for different directories. > * The number of files that are created by a user. > The answers to the above questions give insight into the usage of the > cluster and have significant value for capacity planning. > Importing the edit log into query engines simplifies the task of answering > these questions, and they can be answered efficiently. > While the Offline Editlog Viewer (OEV) supports outputting editlogs in XML > format, we found that it is time-consuming to transform the XML format to > formats that query engines recognize: generating the editlogs in XML format > and transforming them into formats that the query engine understands takes a > significant amount of time. In our environment it takes minutes to turn a > 100MB editlog file into a corresponding Parquet file. > This jira proposes to extend the OEV to output Avro files to make this > process efficient. As an internal tool, the Avro output format has certain > pre-defined schemas but, like the XML output format, it does not have the > constraint of maintaining backward compatibility of the output. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11496) Ozone: Merge with trunk needed a ignore duplicate entry in pom file due to shading
[ https://issues.apache.org/jira/browse/HDFS-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898103#comment-15898103 ] Sean Busbey commented on HDFS-11496: I would guess that the Ozone branch added netty as a dependency for something that is transitively included in minicluster. using maven's dependency tree should show you where the dependency comes in. > Ozone: Merge with trunk needed a ignore duplicate entry in pom file due to > shading > -- > > Key: HDFS-11496 > URL: https://issues.apache.org/jira/browse/HDFS-11496 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer > > The trunk merge needed a hack in the > hadoop-client-modules/hadoop-client-check-test-invariants/pom.xml to ignore > netty files. Not sure that is the right thing to do, so tracking here if we > need to revert this change before we merge. (The XML tags in the quoted diff were stripped by JIRA's renderer; the reconstruction below follows the BanDuplicateClasses per-dependency ignore syntax and may not match the original diff exactly.)
{code}
~/a/hadoop> git diff hadoop-client-modules/hadoop-client-check-test-invariants/pom.xml
─────────────────────────────────────────────────
modified: hadoop-client-modules/hadoop-client-check-test-invariants/pom.xml
─────────────────────────────────────────────────
@ pom.xml:106 @
             <ignoreClass>*</ignoreClass>
           </ignoreClasses>
         </dependency>
+        <dependency>
+          <groupId>io.netty</groupId>
+          <artifactId>netty</artifactId>
+          <ignoreClasses>
+            <ignoreClass>*</ignoreClass>
+          </ignoreClasses>
+        </dependency>
       </dependencies>
{code}
> [~andrew.wang] or [~busbey] Would one of you care to comment if this is the > right thing to do? Thanks in advance. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11496) Ozone: Merge with trunk needed a ignore duplicate entry in pom file due to shading
[ https://issues.apache.org/jira/browse/HDFS-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896391#comment-15896391 ] Sean Busbey commented on HDFS-11496: Changing the invariants is probably the wrong way to go. A given set of classes is supposed to be in exactly one of the three shaded artifacts. My guess would be that netty should be in the runtime artifact. That means you should exclude its inclusion in the minicluster artifact. > Ozone: Merge with trunk needed a ignore duplicate entry in pom file due to > shading > -- > > Key: HDFS-11496 > URL: https://issues.apache.org/jira/browse/HDFS-11496 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer > > The trunk merge needed a hack in the > hadoop-client-modules/hadoop-client-check-test-invariants/pom.xml to ignore > netty files. Not sure that is the right thing to do, so tracking here if we > need to revert this change before we merge. (The XML tags in the quoted diff were stripped by JIRA's renderer; the reconstruction below follows the BanDuplicateClasses per-dependency ignore syntax and may not match the original diff exactly.)
{code}
~/a/hadoop> git diff hadoop-client-modules/hadoop-client-check-test-invariants/pom.xml
─────────────────────────────────────────────────
modified: hadoop-client-modules/hadoop-client-check-test-invariants/pom.xml
─────────────────────────────────────────────────
@ pom.xml:106 @
             <ignoreClass>*</ignoreClass>
           </ignoreClasses>
         </dependency>
+        <dependency>
+          <groupId>io.netty</groupId>
+          <artifactId>netty</artifactId>
+          <ignoreClasses>
+            <ignoreClass>*</ignoreClass>
+          </ignoreClasses>
+        </dependency>
       </dependencies>
{code}
> [~andrew.wang] or [~busbey] Would one of you care to comment if this is the > right thing to do? Thanks in advance. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11379) DFSInputStream may infinite loop requesting block locations
[ https://issues.apache.org/jira/browse/HDFS-11379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15863930#comment-15863930 ] Sean Busbey commented on HDFS-11379: could we get this into a 2.7 release? (maybe a 2.6 if 2.6 is similarly impacted?) > DFSInputStream may infinite loop requesting block locations > --- > > Key: HDFS-11379 > URL: https://issues.apache.org/jira/browse/HDFS-11379 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Fix For: 2.8.0, 3.0.0-alpha3 > > Attachments: HDFS-11379.branch-2.patch, HDFS-11379.trunk.patch > > > DFSInputStream creation caches file size and initial range of locations. If > the file is truncated (or replaced) and the client attempts to read outside > the initial range, the client goes into a tight infinite looping requesting > locations for the nonexistent range. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11379) DFSInputStream may infinite loop requesting block locations
[ https://issues.apache.org/jira/browse/HDFS-11379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-11379: --- Target Version/s: 2.9.0, 3.0.0-alpha3, 2.8.1 (was: 2.8.1) > DFSInputStream may infinite loop requesting block locations > --- > > Key: HDFS-11379 > URL: https://issues.apache.org/jira/browse/HDFS-11379 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-11379.branch-2.patch, HDFS-11379.trunk.patch > > > DFSInputStream creation caches file size and initial range of locations. If > the file is truncated (or replaced) and the client attempts to read outside > the initial range, the client goes into a tight infinite looping requesting > locations for the nonexistent range. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10862) Typos in 7 log messages
[ https://issues.apache.org/jira/browse/HDFS-10862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15487513#comment-15487513 ] Sean Busbey commented on HDFS-10862: good finds! interested in coming up with a patch for fixing these? [Our contributor guide|http://wiki.apache.org/hadoop/HowToContribute] covers everything needed to work across the code base, but I'd imagine for these things a simple manual test of compilation before and after the changes should suffice. > Typos in 7 log messages > --- > > Key: HDFS-10862 > URL: https://issues.apache.org/jira/browse/HDFS-10862 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Mehran Hassani >Priority: Trivial > Labels: newbie > > I am conducting research on log related bugs. I tried to make a tool to fix > repetitive yet simple patterns of bugs that are related to logs. Typos in log > messages are one of the reoccurring bugs. Therefore, I made a tool find typos > in log statements. During my experiments, I managed to find the following > typos in Hadoop HDFS: > In file > /hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java, > LOG.info((success ? "S" : "Uns") +"uccessfully sent block report 0x" > +Long.toHexString(reportId) + " containing " + reports.length +" storage > report(s) of which we sent " + numReportsSent + "." +" The reports had " + > totalBlockCount +" total blocks and used " + numRPCs +" RPC(s). This took " + > brCreateCost +" msec to generate and " + brSendCost +" msecs for RPC and NN > processing." +" Got back " +((nCmds == 0) ? "no commands" :((nCmds == 1) ? > "one command: " + cmds.get(0) :(nCmds + " commands: " + Joiner.on("; > ").join(cmds +"."), > uccessfullysuccessfully > In file > /hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java, > LOG.info("Balancing bandwith is " + bandwidth + " bytes/s"), > bandwith should be bandwidth > In file > /hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java, > FsDatasetImpl.LOG.info("The volume " + v + " is closed while " +"addng > replicas ignored."), > addng should be adding > In file > /hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CancelDelegationTokenServlet.java, > LOG.info("Exception while cancelling token. Re-throwing. " e), > cancelling should be canceling > In file > /hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java, > NameNode.LOG.info("Caching file names occuring more than " + threshold+ " > times"), > occuring should be occurring > In file > /hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java, > LOG.info("NNStorage.attemptRestoreRemovedStorage: check removed(failed) > "+"storarge. removedStorages size = " + removedStorageDirs.size()), > storarge should be storage > In file > /hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java, > LOG.info("Partical read. Asked offset: " + offset + " count: " + count+ " > and read back: " + readCount + " file size: "+ attrs.getSize()), > Partical should be Partial -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
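Each of the requested fixes is a one-word change to the quoted statement; for example, the DataXceiverServer message would become (a sketch of the fix, not a line from a committed patch):
{code}
// "bandwith" corrected to "bandwidth"; LOG and bandwidth come from the surrounding class
LOG.info("Balancing bandwidth is " + bandwidth + " bytes/s");
{code}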
[jira] [Resolved] (HDFS-10767) downgrade from 2.7.2 to 2.5.0
[ https://issues.apache.org/jira/browse/HDFS-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey resolved HDFS-10767. Resolution: Invalid > downgrade from 2.7.2 to 2.5.0 > - > > Key: HDFS-10767 > URL: https://issues.apache.org/jira/browse/HDFS-10767 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 > Environment: hdfs 2.5.0 2.7.2 >Reporter: jin xing > > I have already upgraded my cluster’s namenodes (with one standby for HA) and > several datanodes from 2.5.0 following > https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html#Downgrade_and_Rollback; > > I took the following steps: > 1. hdfs dfsadmin -rollingUpgrade prepare; > 2. hdfs dfsadmin -rollingUpgrade query; > 3. hdfs dfsadmin -shutdownDatanode upgrade > 4. restart and upgrade datanode; > However, I terminated the upgrade by mistake with command "hdfs dfsadmin > -rollingUpgrade finalize" > Currently, I have two 2.7.2 namenodes, and three 2.7.2 datanodes and 63 2.5.0 > datanodes; Now I want to downgrade the namenodes and datanodes from 2.7.2 > back to 2.5.0; > But when I try to downgrade a namenode and restart with “-rollingUpgrade > downgrade”, the namenode cannot get started, I get the following exception: > 2016-08-16 20:37:08,642 WARN > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception > loading fsimage > org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected > version of storage directory /home/maintain/hadoop/data/hdfs-namenode. > Reported: -63. Expecting = -57. > at > org.apache.hadoop.hdfs.server.common.StorageInfo.setLayoutVersion(StorageInfo.java:178) > at > org.apache.hadoop.hdfs.server.common.StorageInfo.setFieldsFromProperties(StorageInfo.java:131) > at > org.apache.hadoop.hdfs.server.namenode.NNStorage.setFieldsFromProperties(NNStorage.java:608) > at > org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:228) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:323) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:955) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:751) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:735) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1407) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473) > 2016-08-16 20:37:08,645 INFO org.mortbay.log: Stopped > HttpServer2$SelectChannelConnectorWithSafeStartup@dx-pipe-sata61-pm:50070 > 2016-08-16 20:37:08,745 INFO > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics > system... > 2016-08-16 20:37:08,746 INFO > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system > stopped. > 2016-08-16 20:37:08,746 INFO > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system > shutdown complete. 
> 2016-08-16 20:37:08,746 FATAL > org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join > org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected > version of storage directory /home/maintain/hadoop/data/hdfs-namenode. > Reported: -63. Expecting = -57. > at > org.apache.hadoop.hdfs.server.common.StorageInfo.setLayoutVersion(StorageInfo.java:178) > at > org.apache.hadoop.hdfs.server.common.StorageInfo.setFieldsFromProperties(StorageInfo.java:131) > at > org.apache.hadoop.hdfs.server.namenode.NNStorage.setFieldsFromProperties(NNStorage.java:608) > at > org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:228) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:323) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:955) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585) > at > org.apache.hadoop.hdfs.server.namenode
[jira] [Commented] (HDFS-10767) downgrade from 2.7.2 to 2.5.0
[ https://issues.apache.org/jira/browse/HDFS-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15423167#comment-15423167 ] Sean Busbey commented on HDFS-10767: This was also posted to user@hadoop, let's keep the discussion there. ([ref the discussion thread|https://lists.apache.org/thread.html/11e53ba87bf03db348bbaff9f24065fb262839417a3ec06944d6b92b@%3Cuser.hadoop.apache.org%3E]) > downgrade from 2.7.2 to 2.5.0 > - > > Key: HDFS-10767 > URL: https://issues.apache.org/jira/browse/HDFS-10767 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.5.0 > Environment: hdfs 2.5.0 2.7.2 >Reporter: jin xing > > I have already upgraded my cluster’s namenodes (with one standby for HA) and > several datanodes from 2.5.0 following > https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html#Downgrade_and_Rollback; > > I took the following steps: > 1. hdfs dfsadmin -rollingUpgrade prepare; > 2. hdfs dfsadmin -rollingUpgrade query; > 3. hdfs dfsadmin -shutdownDatanode upgrade > 4. restart and upgrade datanode; > However, I terminated the upgrade by mistake with command "hdfs dfsadmin > -rollingUpgrade finalize" > Currently, I have two 2.7.2 namenodes, and three 2.7.2 datanodes and 63 2.5.0 > datanodes; Now I want to downgrade the namenodes and datanodes from 2.7.2 > back to 2.5.0; > But when I try to downgrade a namenode and restart with “-rollingUpgrade > downgrade”, the namenode cannot get started, I get the following exception: > 2016-08-16 20:37:08,642 WARN > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception > loading fsimage > org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected > version of storage directory /home/maintain/hadoop/data/hdfs-namenode. > Reported: -63. Expecting = -57. > at > org.apache.hadoop.hdfs.server.common.StorageInfo.setLayoutVersion(StorageInfo.java:178) > at > org.apache.hadoop.hdfs.server.common.StorageInfo.setFieldsFromProperties(StorageInfo.java:131) > at > org.apache.hadoop.hdfs.server.namenode.NNStorage.setFieldsFromProperties(NNStorage.java:608) > at > org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:228) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:323) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:955) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:700) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:529) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:585) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:751) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:735) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1407) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473) > 2016-08-16 20:37:08,645 INFO org.mortbay.log: Stopped > HttpServer2$SelectChannelConnectorWithSafeStartup@dx-pipe-sata61-pm:50070 > 2016-08-16 20:37:08,745 INFO > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics > system... > 2016-08-16 20:37:08,746 INFO > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system > stopped. 
> 2016-08-16 20:37:08,746 INFO > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system > shutdown complete. > 2016-08-16 20:37:08,746 FATAL > org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join > org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected > version of storage directory /home/maintain/hadoop/data/hdfs-namenode. > Reported: -63. Expecting = -57. > at > org.apache.hadoop.hdfs.server.common.StorageInfo.setLayoutVersion(StorageInfo.java:178) > at > org.apache.hadoop.hdfs.server.common.StorageInfo.setFieldsFromProperties(StorageInfo.java:131) > at > org.apache.hadoop.hdfs.server.namenode.NNStorage.setFieldsFromProperties(NNStorage.java:608) > at > org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:228) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:323) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:955) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.
[jira] [Commented] (HDFS-10758) ReconfigurableBase can log sensitive information
[ https://issues.apache.org/jira/browse/HDFS-10758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418982#comment-15418982 ] Sean Busbey commented on HDFS-10758: bq. I think a generic mechanism for redacting sensitive information for textual display will be useful to some of the web UIs too. Should this be in the Hadoop Common tracker so that the solution can be leveraged by both HDFS and YARN? > ReconfigurableBase can log sensitive information > > > Key: HDFS-10758 > URL: https://issues.apache.org/jira/browse/HDFS-10758 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Sean Mackrory >Assignee: Sean Mackrory > > ReconfigurableBase will log old and new configuration values, which may cause > sensitive parameters (most notably cloud storage keys, though there may be > other instances) to get included in the logs. > Given the currently small list of reconfigurable properties, an argument > could be made for simply not logging the property values at all, but this is > not the only instance where potentially sensitive configuration gets written > somewhere else in plaintext. I think a generic mechanism for redacting > sensitive information for textual display will be useful to some of the web > UIs too. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
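A minimal sketch of the generic redaction mechanism proposed above; the class name and the sensitive-key markers are hypothetical, not an existing Hadoop API:
{code}
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public final class ConfigRedactor {
  // Hypothetical marker list; a real implementation would make this configurable.
  private static final List<String> SENSITIVE_MARKERS =
      Arrays.asList("password", "secret", "access.key");

  private ConfigRedactor() {}

  /** Returns the value unchanged, or a placeholder if the key looks sensitive. */
  public static String redact(String key, String value) {
    String lower = key.toLowerCase(Locale.ROOT);
    for (String marker : SENSITIVE_MARKERS) {
      if (lower.contains(marker)) {
        return "<redacted>";
      }
    }
    return value;
  }
}
{code}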
[jira] [Commented] (HDFS-10707) Replace org.apache.commons.io.Charsets with java.nio.charset.StandardCharsets
[ https://issues.apache.org/jira/browse/HDFS-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400967#comment-15400967 ] Sean Busbey commented on HDFS-10707: I agree that the failure looks unrelated. What branches are we targeting? branch-2.8, branch-2, and trunk? > Replace org.apache.commons.io.Charsets with java.nio.charset.StandardCharsets > - > > Key: HDFS-10707 > URL: https://issues.apache.org/jira/browse/HDFS-10707 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Vincent Poon >Priority: Minor > Attachments: HDFS-10707.2.patch, HDFS-10707.patch > > > org.apache.commons.io.Charsets is deprecated in favor of > java.nio.charset.StandardCharsets -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
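The change under review is mechanical; an illustrative before/after (not a line from the patch):
{code}
import java.nio.charset.StandardCharsets;

public class CharsetExample {
  public static byte[] encode(String text) {
    // before: text.getBytes(org.apache.commons.io.Charsets.UTF_8)  -- deprecated
    return text.getBytes(StandardCharsets.UTF_8);
  }
}
{code}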
[jira] [Commented] (HDFS-10601) Improve log message to include hostname when the NameNode is in safemode
[ https://issues.apache.org/jira/browse/HDFS-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378418#comment-15378418 ] Sean Busbey commented on HDFS-10601: +1 (non-binding) > Improve log message to include hostname when the NameNode is in safemode > > > Key: HDFS-10601 > URL: https://issues.apache.org/jira/browse/HDFS-10601 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Minor > Attachments: HDFS-10601.001.patch, HDFS-10601.002.patch > > > When remote NN operations are involved, it would be nice to have the Namenode > hostname in safemode notification log. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8791) block ID-based DN storage layout can be very slow for datanode on ext4
[ https://issues.apache.org/jira/browse/HDFS-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15225522#comment-15225522 ] Sean Busbey commented on HDFS-8791: --- Downgrade breakage only in minor versions please > block ID-based DN storage layout can be very slow for datanode on ext4 > -- > > Key: HDFS-8791 > URL: https://issues.apache.org/jira/browse/HDFS-8791 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0, 2.8.0, 2.7.1 >Reporter: Nathan Roberts >Assignee: Chris Trezzo >Priority: Blocker > Fix For: 2.7.3 > > Attachments: 32x32DatanodeLayoutTesting-v1.pdf, > 32x32DatanodeLayoutTesting-v2.pdf, HDFS-8791-trunk-v1.patch, > HDFS-8791-trunk-v2-bin.patch, HDFS-8791-trunk-v2.patch, > HDFS-8791-trunk-v2.patch, HDFS-8791-trunk-v3-bin.patch, > hadoop-56-layout-datanode-dir.tgz, test-node-upgrade.txt > > > We are seeing cases where the new directory layout causes the datanode to > basically cause the disks to seek for 10s of minutes. This can be when the > datanode is running du, and it can also be when it is performing a > checkDirs(). Both of these operations currently scan all directories in the > block pool and that's very expensive in the new layout. > The new layout creates 256 subdirs, each with 256 subdirs. Essentially 64K > leaf directories where block files are placed. > So, what we have on disk is: > - 256 inodes for the first level directories > - 256 directory blocks for the first level directories > - 256*256 inodes for the second level directories > - 256*256 directory blocks for the second level directories > - Then the inodes and blocks to store the the HDFS blocks themselves. > The main problem is the 256*256 directory blocks. > inodes and dentries will be cached by linux and one can configure how likely > the system is to prune those entries (vfs_cache_pressure). However, ext4 > relies on the buffer cache to cache the directory blocks and I'm not aware of > any way to tell linux to favor buffer cache pages (even if it did I'm not > sure I would want it to in general). > Also, ext4 tries hard to spread directories evenly across the entire volume, > this basically means the 64K directory blocks are probably randomly spread > across the entire disk. A du type scan will look at directories one at a > time, so the ioscheduler can't optimize the corresponding seeks, meaning the > seeks will be random and far. > In a system I was using to diagnose this, I had 60K blocks. A DU when things > are hot is less than 1 second. When things are cold, about 20 minutes. > How do things get cold? > - A large set of tasks run on the node. This pushes almost all of the buffer > cache out, causing the next DU to hit this situation. We are seeing cases > where a large job can cause a seek storm across the entire cluster. > Why didn't the previous layout see this? > - It might have but it wasn't nearly as pronounced. The previous layout would > be a few hundred directory blocks. Even when completely cold, these would > only take a few a hundred seeks which would mean single digit seconds. > - With only a few hundred directories, the odds of the directory blocks > getting modified is quite high, this keeps those blocks hot and much less > likely to be evicted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
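To make the directory arithmetic above concrete, here is a sketch of a 256x256 block-ID-to-path mapping; the bit masks and subdir naming are assumptions drawn from the description, not code copied from the datanode:
{code}
import java.io.File;

final class BlockLayoutSketch {
  // 256 first-level and 256 second-level subdirs -> 64K leaf directories.
  static File blockDirFor(File bpRoot, long blockId) {
    int d1 = (int) ((blockId >> 16) & 0xFF);
    int d2 = (int) ((blockId >> 8) & 0xFF);
    return new File(bpRoot, "subdir" + d1 + File.separator + "subdir" + d2);
  }
}
{code}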
[jira] [Commented] (HDFS-8791) block ID-based DN storage layout can be very slow for datanode on ext4
[ https://issues.apache.org/jira/browse/HDFS-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219899#comment-15219899 ] Sean Busbey commented on HDFS-8791: --- Please expand the release note to state that should something go wrong, rolling downgrade will not be possible and a rollback (which requires downtime) will be needed. Has someone tested that rollback works after upgrading to a version with this patch? I saw someone manually examined a non-finalized {{previous}} directory, but I'm wondering if someone walked through the entire process. > block ID-based DN storage layout can be very slow for datanode on ext4 > -- > > Key: HDFS-8791 > URL: https://issues.apache.org/jira/browse/HDFS-8791 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.6.0, 2.8.0, 2.7.1 >Reporter: Nathan Roberts >Assignee: Chris Trezzo >Priority: Blocker > Fix For: 2.7.3 > > Attachments: 32x32DatanodeLayoutTesting-v1.pdf, > 32x32DatanodeLayoutTesting-v2.pdf, HDFS-8791-trunk-v1.patch, > HDFS-8791-trunk-v2-bin.patch, HDFS-8791-trunk-v2.patch, > HDFS-8791-trunk-v2.patch, HDFS-8791-trunk-v3-bin.patch, > hadoop-56-layout-datanode-dir.tgz, test-node-upgrade.txt > > > We are seeing cases where the new directory layout causes the datanode to > basically cause the disks to seek for 10s of minutes. This can be when the > datanode is running du, and it can also be when it is performing a > checkDirs(). Both of these operations currently scan all directories in the > block pool and that's very expensive in the new layout. > The new layout creates 256 subdirs, each with 256 subdirs. Essentially 64K > leaf directories where block files are placed. > So, what we have on disk is: > - 256 inodes for the first level directories > - 256 directory blocks for the first level directories > - 256*256 inodes for the second level directories > - 256*256 directory blocks for the second level directories > - Then the inodes and blocks to store the the HDFS blocks themselves. > The main problem is the 256*256 directory blocks. > inodes and dentries will be cached by linux and one can configure how likely > the system is to prune those entries (vfs_cache_pressure). However, ext4 > relies on the buffer cache to cache the directory blocks and I'm not aware of > any way to tell linux to favor buffer cache pages (even if it did I'm not > sure I would want it to in general). > Also, ext4 tries hard to spread directories evenly across the entire volume, > this basically means the 64K directory blocks are probably randomly spread > across the entire disk. A du type scan will look at directories one at a > time, so the ioscheduler can't optimize the corresponding seeks, meaning the > seeks will be random and far. > In a system I was using to diagnose this, I had 60K blocks. A DU when things > are hot is less than 1 second. When things are cold, about 20 minutes. > How do things get cold? > - A large set of tasks run on the node. This pushes almost all of the buffer > cache out, causing the next DU to hit this situation. We are seeing cases > where a large job can cause a seek storm across the entire cluster. > Why didn't the previous layout see this? > - It might have but it wasn't nearly as pronounced. The previous layout would > be a few hundred directory blocks. Even when completely cold, these would > only take a few a hundred seeks which would mean single digit seconds. 
> - With only a few hundred directories, the odds of the directory blocks > getting modified is quite high, this keeps those blocks hot and much less > likely to be evicted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10230) HDFS Native Client build failed
[ https://issues.apache.org/jira/browse/HDFS-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216256#comment-15216256 ] Sean Busbey commented on HDFS-10230: this is a duplicate of HADOOP-12692. > HDFS Native Client build failed > --- > > Key: HDFS-10230 > URL: https://issues.apache.org/jira/browse/HDFS-10230 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Reporter: John Zhuge >Priority: Blocker > > HDFS Native Client build failed: > https://builds.apache.org/job/Hadoop-trunk-Commit/9514/console > {code} > [INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ > hadoop-hdfs-native-client --- > Downloading: > https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-hdfs/3.0.0-SNAPSHOT/hadoop-hdfs-3.0.0-20160328.214654-6500.pom > 4/21 KB > 8/21 KB > 8/21 KB > 12/21 KB > 14/21 KB > 18/21 KB > 21/21 KB > > Downloaded: > https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-hdfs/3.0.0-SNAPSHOT/hadoop-hdfs-3.0.0-20160328.214654-6500.pom > (21 KB at 193.9 KB/sec) > [WARNING] > Dependency convergence error for org.apache.hadoop:hadoop-hdfs:3.0.0-SNAPSHOT > paths to dependency are: > +-org.apache.hadoop:hadoop-hdfs-native-client:3.0.0-SNAPSHOT > +-org.apache.hadoop:hadoop-hdfs:3.0.0-SNAPSHOT > and > +-org.apache.hadoop:hadoop-hdfs-native-client:3.0.0-SNAPSHOT > +-org.apache.hadoop:hadoop-hdfs:3.0.0-20160328.214654-6500 > [WARNING] Rule 0: org.apache.maven.plugins.enforcer.DependencyConvergence > failed with message: > Failed while enforcing releasability the error(s) are [ > Dependency convergence error for org.apache.hadoop:hadoop-hdfs:3.0.0-SNAPSHOT > paths to dependency are: > +-org.apache.hadoop:hadoop-hdfs-native-client:3.0.0-SNAPSHOT > +-org.apache.hadoop:hadoop-hdfs:3.0.0-SNAPSHOT > and > +-org.apache.hadoop:hadoop-hdfs-native-client:3.0.0-SNAPSHOT > +-org.apache.hadoop:hadoop-hdfs:3.0.0-20160328.214654-6500 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7608) hdfs dfsclient newConnectedPeer has no write timeout
[ https://issues.apache.org/jira/browse/HDFS-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216249#comment-15216249 ] Sean Busbey commented on HDFS-7608: --- can we get this backported to 2.6 and 2.7 please? > hdfs dfsclient newConnectedPeer has no write timeout > - > > Key: HDFS-7608 > URL: https://issues.apache.org/jira/browse/HDFS-7608 > Project: Hadoop HDFS > Issue Type: Bug > Components: fuse-dfs, hdfs-client >Affects Versions: 2.3.0, 2.6.0 > Environment: hdfs 2.3.0 hbase 0.98.6 >Reporter: zhangshilong >Assignee: Xiaoyu Yao > Fix For: 2.8.0 > > Attachments: HDFS-7608.0.patch, HDFS-7608.1.patch, HDFS-7608.2.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > problem: > hbase compactSplitThread may lock forever on read datanode blocks. > debug found: epollwait timeout set to 0,so epollwait can not run out. > cause: in hdfs 2.3.0 > hbase using DFSClient to read and write blocks. > DFSClient creates one socket using newConnectedPeer(addr), but has no read > or write timeout. > in v 2.6.0, newConnectedPeer has added readTimeout to deal with the > problem,but did not add writeTimeout. why did not add write Timeout? > I think NioInetPeer need a default socket timeout,so appalications will no > need to force adding timeout by themselives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
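The backport being requested boils down to pairing the existing read timeout with a write timeout when the client builds a peer; a sketch of the shape, assuming the Peer interface's timeout setters (the wiring here is illustrative):
{code}
import java.io.IOException;
import org.apache.hadoop.hdfs.net.Peer;

final class PeerTimeouts {
  static void configure(Peer peer, int socketTimeoutMs) throws IOException {
    peer.setReadTimeout(socketTimeoutMs);  // added by the 2.6.0-era fix
    peer.setWriteTimeout(socketTimeoutMs); // the missing half discussed above
  }
}
{code}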
[jira] [Updated] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-10223: --- Target Version/s: 2.8.0, 2.7.3, 2.6.5 (was: 2.8.0, 2.7.3) > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch, HDFS-10223.002.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-10223: --- Target Version/s: 2.8.0, 2.7.3 (was: 2.8.0) > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch, HDFS-10223.002.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214867#comment-15214867 ] Sean Busbey commented on HDFS-10223: +1 non-binding, pending non-surprising buildbot feedback > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
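The ordering the patch enforces, sketched with plain java.net sockets (an assumption for illustration; the actual patch works on Hadoop's peer and SASL classes):
{code}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class TimeoutFirst {
  static Socket connect(InetSocketAddress addr, int timeoutMs) throws IOException {
    Socket s = new Socket();
    s.connect(addr, timeoutMs); // bounded connect
    s.setSoTimeout(timeoutMs);  // bound reads BEFORE any SASL traffic, so the
                                // handshake cannot hang for the OS-default TCP timeout
    return s;
  }
}
{code}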
[jira] [Commented] (HDFS-10183) Prevent race condition during class initialization
[ https://issues.apache.org/jira/browse/HDFS-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201809#comment-15201809 ] Sean Busbey commented on HDFS-10183: +1 LGTM, presuming QABot has no complaints. I did a search through the code and I don't see any other instances currently. > Prevent race condition during class initialization > -- > > Key: HDFS-10183 > URL: https://issues.apache.org/jira/browse/HDFS-10183 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Affects Versions: 2.9.0 >Reporter: Pavel Avgustinov >Priority: Minor > Fix For: 2.9.0 > > Attachments: HADOOP-12944.1.patch > > > In HADOOP-11969, [~busbey] tracked down a non-deterministic > {{NullPointerException}} to an oddity in the Java memory model: When multiple > threads trigger the loading of a class at the same time, one of them wins and > creates the {{java.lang.Class}} instance; the others block during this > initialization, but once it is complete they may obtain a reference to the > {{Class}} which has non-{{final}} fields still containing their default (i.e. > {{null}}) values. This leads to runtime failures that are hard to debug or > diagnose. > HADOOP-11969 observed that {{ThreadLocal}} fields, by their very nature, are > very likely to be accessed from multiple threads, and thus the problem is > particularly severe there. Consequently, the patch removed all occurrences of > the issue in the code base. > Unfortunately, since then HDFS-7964 has [reverted one of the fixes during a > refactoring|https://github.com/apache/hadoop/commit/2151716832ad14932dd65b1a4e47e64d8d6cd767#diff-0c2e9f7f9e685f38d1a11373b627cfa6R151], > and introduced a [new instance of the > problem|https://github.com/apache/hadoop/commit/2151716832ad14932dd65b1a4e47e64d8d6cd767#diff-6334d0df7d9aefbccd12b21bb7603169R43]. > The attached patch addresses the issue by adding the missing {{final}} > modifier in these two cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
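An illustration of the hazard and the one-word fix; the class here is hypothetical (the real patch touches ThreadLocal fields in hdfs):
{code}
final class InitRaceExample {
  // Hazard described above: without 'final', a thread that raced on class
  // initialization may observe this field as null even after init completes.
  static ThreadLocal<StringBuilder> UNSAFE =
      ThreadLocal.withInitial(StringBuilder::new);

  // The fix: adding 'final' closes the window.
  static final ThreadLocal<StringBuilder> SAFE =
      ThreadLocal.withInitial(StringBuilder::new);
}
{code}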
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698064#comment-14698064 ] Sean Busbey commented on HDFS-8344: --- the issue where the other reports have been showing up is HDFS-8406. I believe in several cases we're doing burn in tests against cluster deployments, so if the config isn't something we'd have people run with we ought not do it. IIRC, the lease failures were over pretty extended periods of time. > 15 minutes for the cases where it caused my HBase failures. > NameNode doesn't recover lease for files with missing blocks > > > Key: HDFS-8344 > URL: https://issues.apache.org/jira/browse/HDFS-8344 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Fix For: 2.8.0 > > Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, > HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, > HDFS-8344.06.patch, HDFS-8344.07.patch, HDFS-8344.08.patch, HDFS-8344.09.patch > > > I found another\(?) instance in which the lease is not recovered. This is > reproducible easily on a pseudo-distributed single node cluster > # Before you start it helps if you set. This is not necessary, but simply > reduces how long you have to wait > {code} > public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000; > public static final long LEASE_HARDLIMIT_PERIOD = 2 * > LEASE_SOFTLIMIT_PERIOD; > {code} > # Client starts to write a file. (could be less than 1 block, but it hflushed > so some of the data has landed on the datanodes) (I'm copying the client code > I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar) > # Client crashes. (I simulate this by kill -9 the $(hadoop jar > TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter" > # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was > only 1) > I believe the lease should be recovered and the block should be marked > missing. However this is not happening. The lease is never recovered. > The effect of this bug for us was that nodes could not be decommissioned > cleanly. Although we knew that the client had crashed, the Namenode never > released the leases (even after restarting the Namenode) (even months > afterwards). There are actually several other cases too where we don't > consider what happens if ALL the datanodes die while the file is being > written, but I am going to punt on that for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697977#comment-14697977 ] Sean Busbey commented on HDFS-8344: --- This has been a recurring problem for both HBase and Accumulo in test rigs. I don't think we care if the value is configurable so long as it is guaranteed to terminate and does so in a reasonably short (single-digit-seconds) period of time since it is in our recovery paths. > NameNode doesn't recover lease for files with missing blocks > > > Key: HDFS-8344 > URL: https://issues.apache.org/jira/browse/HDFS-8344 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Fix For: 2.8.0 > > Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, > HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, > HDFS-8344.06.patch, HDFS-8344.07.patch, HDFS-8344.08.patch, HDFS-8344.09.patch > > > I found another\(?) instance in which the lease is not recovered. This is > reproducible easily on a pseudo-distributed single node cluster > # Before you start it helps if you set. This is not necessary, but simply > reduces how long you have to wait > {code} > public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000; > public static final long LEASE_HARDLIMIT_PERIOD = 2 * > LEASE_SOFTLIMIT_PERIOD; > {code} > # Client starts to write a file. (could be less than 1 block, but it hflushed > so some of the data has landed on the datanodes) (I'm copying the client code > I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar) > # Client crashes. (I simulate this by kill -9 the $(hadoop jar > TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter" > # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was > only 1) > I believe the lease should be recovered and the block should be marked > missing. However this is not happening. The lease is never recovered. > The effect of this bug for us was that nodes could not be decommissioned > cleanly. Although we knew that the client had crashed, the Namenode never > released the leases (even after restarting the Namenode) (even months > afterwards). There are actually several other cases too where we don't > consider what happens if ALL the datanodes die while the file is being > written, but I am going to punt on that for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8406) Lease recovery continually failed
[ https://issues.apache.org/jira/browse/HDFS-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-8406: -- Labels: Accumulo HBase (was: ) > Lease recovery continually failed > - > > Key: HDFS-8406 > URL: https://issues.apache.org/jira/browse/HDFS-8406 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Keith Turner > Labels: Accumulo, HBase > > While testing Accumulo on a cluster and killing processes, I ran into a > situation where the lease on an accumulo write ahead log in HDFS could not be > recovered. Even restarting HDFS and Accumulo would not fix the problem. > The following message was seen in an Accumulo tablet server log immediately > before the tablet server was killed. > {noformat} > 2015-05-14 17:12:37,466 [hdfs.DFSClient] WARN : DFSOutputStream > ResponseProcessor exception for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 > java.io.IOException: Bad response ERROR for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 from datanode > 10.1.5.9:50010 > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:897) > 2015-05-14 17:12:37,466 [hdfs.DFSClient] WARN : Error Recovery for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 in pipeline > 10.1.5.55:50010, 10.1.5.9:5 > {noformat} > Before recovering data from a write ahead log, the Accumulo master attempts > to recover the lease. This repeatedly failed with messages like the > following. > {noformat} > 2015-05-14 17:14:54,301 [recovery.HadoopLogCloser] WARN : Error recovering > lease on > hdfs://10.1.5.6:1/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): > failed to create file > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 for > DFSClient_NONMAPREDUCE_950713214_16 for client 10.1.5.158 because > pendingCreates is non-null but no leases found. > {noformat} > Below is some info from the NN logs for the problematic file. > {noformat} > [ec2-user@leader2 logs]$ grep 3a731759-3594-4535-8086-245 > hadoop-ec2-user-namenode-leader2.log > 2015-05-14 17:10:46,299 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > allocateBlock: > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2. > BP-802741494-10.1.5.6-1431557089849 > blk_1073932823_192060{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-ffe07d7d-0e68-45b8-b3d5-c976f1716481:NORMAL:10.1.5.55:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-6efec702-3f1f-4ec0-a31f-de947e7e6097:NORMAL:10.1.5.9:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-5e27df17-abf8-47df-b4bc-c38d0cd426ea:NORMAL:10.1.5.45:50010|RBW]]} > 2015-05-14 17:10:46,628 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > fsync: /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 for > DFSClient_NONMAPREDUCE_-1128465883_16 > 2015-05-14 17:14:49,288 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: recoverLease: [Lease. > Holder: DFSClient_NONMAPREDUCE_-1128465883_16, pendingcreates: 1], > src=/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 from > client DFSClient_NONMAPREDUCE_-1128465883_16 > 2015-05-14 17:14:49,288 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. 
> Holder: DFSClient_NONMAPREDUCE_-1128465883_16, pendingcreates: 1], > src=/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 > 2015-05-14 17:14:49,289 WARN org.apache.hadoop.hdfs.StateChange: DIR* > NameSystem.internalReleaseLease: File > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 has not been > closed. Lease recovery is in progress. RecoveryId = 192257 for block > blk_1073932823_192060{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, > replicas=[ReplicaUnderConstruction[[DISK]DS-ffe07d7d-0e68-45b8-b3d5-c976f1716481:NORMAL:10.1.5.55:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-6efec702-3f1f-4ec0-a31f-de947e7e6097:NORMAL:10.1.5.9:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-5e27df17-abf8-47df-b4bc-c38d0cd426ea:NORMAL:10.1.5.45:50010|RBW]]} > java.lang.IllegalStateException: Failed to finalize INodeFile > 3a731759-3594-4535-8086-245eed7cd4c2 since blocks[0] is non-complete, where > blocks=[blk_1073932823_192257{blockUCState=COMMITTED, primaryNodeIndex=2, > replicas=[ReplicaUnderConstruction[[DISK]DS-ffe07d7d-0e68-45b8-b3d5-c976f1716481:NORMAL:10.1.5.55:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-5e27df17-abf8-47df-b4bc-c38d0cd426ea:NORMAL:10.1.5.45:50010|RBW]]}]. > 2015-05-14 17:14:54,292 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 1 on 1, call org.apache.hadoo
[jira] [Commented] (HDFS-6564) Use slf4j instead of common-logging in hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592976#comment-14592976 ] Sean Busbey commented on HDFS-6564: --- Since the LOG member was a commons-logging object before its removal, I would like to include how to get the equivalent logger from that framework. Including how to do it in slf4j as well sounds fine. > Use slf4j instead of common-logging in hdfs-client > -- > > Key: HDFS-6564 > URL: https://issues.apache.org/jira/browse/HDFS-6564 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Rakesh R > Attachments: HDFS-6564-01.patch, HDFS-6564-02.patch, > HDFS-6564-03.patch > > > hdfs-client should depends on slf4j instead of common-logging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6564) Use slf4j instead of common-logging in hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592927#comment-14592927 ] Sean Busbey commented on HDFS-6564: --- You should include mention of how to get the same functionality. In this case, they can retrieve the named logger via the logging framework of their choice directly. Since the previous member was a commons-logging object, that would be done via {{LogFactory.getLog(CachePoolInfo.class)}} or {{LogFactory.getLog("org.apache.hadoop.hdfs.protocol.CachePoolInfo")}}. > Use slf4j instead of common-logging in hdfs-client > -- > > Key: HDFS-6564 > URL: https://issues.apache.org/jira/browse/HDFS-6564 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Rakesh R > Attachments: HDFS-6564-01.patch, HDFS-6564-02.patch, > HDFS-6564-03.patch > > > hdfs-client should depends on slf4j instead of common-logging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
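Concretely, the two retrievals described above (the class-literal and string forms are interchangeable):
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class LoggerRetrieval {
  // Equivalent of the removed commons-logging member:
  static final Log COMMONS = LogFactory.getLog(CachePoolInfo.class);
  // Or via slf4j, matching the class's new LOG field:
  static final Logger SLF4J =
      LoggerFactory.getLogger("org.apache.hadoop.hdfs.protocol.CachePoolInfo");
}
{code}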
[jira] [Commented] (HDFS-6564) Use slf4j instead of common-logging in hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592831#comment-14592831 ] Sean Busbey commented on HDFS-6564: --- Whatever you decide to do about the LOG member, if it involves the class changing or it being removed please remember to flag the change and make a release note. That way downstream folks find out about it up front rather than by surprise when they are broken. > Use slf4j instead of common-logging in hdfs-client > -- > > Key: HDFS-6564 > URL: https://issues.apache.org/jira/browse/HDFS-6564 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Rakesh R > Attachments: HDFS-6564-01.patch, HDFS-6564-02.patch > > > hdfs-client should depends on slf4j instead of common-logging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6564) Use slf4j instead of common-logging in hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590511#comment-14590511 ] Sean Busbey commented on HDFS-6564: --- the patch looks good. do you have a draft of the needed release note? > Use slf4j instead of common-logging in hdfs-client > -- > > Key: HDFS-6564 > URL: https://issues.apache.org/jira/browse/HDFS-6564 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Rakesh R > Attachments: HDFS-6564-01.patch, HDFS-6564-02.patch > > > hdfs-client should depends on slf4j instead of common-logging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6564) Use slf4j instead of common-logging in hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589939#comment-14589939 ] Sean Busbey commented on HDFS-6564: --- {code} diff --git hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/CachePoolInfo.java hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/CachePoolInfo.java index 61bbe38..31850dc 100644 --- hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/CachePoolInfo.java +++ hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/CachePoolInfo.java @@ -24,13 +24,13 @@ import org.apache.commons.lang.builder.EqualsBuilder; import org.apache.commons.lang.builder.HashCodeBuilder; -import org.apache.commons.logging.Log; -import org.apache.commons.logging.LogFactory; import org.apache.hadoop.classification.InterfaceAudience; import org.apache.hadoop.classification.InterfaceStability; import org.apache.hadoop.fs.InvalidRequestException; import org.apache.hadoop.fs.permission.FsPermission; import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo.Expiration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; /** * CachePoolInfo describes a cache pool. @@ -41,7 +41,8 @@ @InterfaceAudience.Public @InterfaceStability.Evolving public class CachePoolInfo { - public static final Log LOG = LogFactory.getLog(CachePoolInfo.class); + public static final Logger LOG = LoggerFactory + .getLogger(CachePoolInfo.class); {code} This is a binary and source incompatible change to public API. Please be sure to flag the jira as such and provide release notes. {code} @@ -733,9 +723,7 @@ final T getResponse(HttpURLConnection conn) throws IOException { } catch (Exception e) { // catch json parser errors final IOException ioe = new IOException("Response decoding failure: "+e.toString(), e); -if (LOG.isDebugEnabled()) { - LOG.debug(ioe); -} +LOG.debug("Response decoding failure: {}", e.toString(), e); throw ioe; {code} nit: it's possible, though unlikely, that this e.toString call is still expensive and worth guarding with isDebugEnabled. > Use slf4j instead of common-logging in hdfs-client > -- > > Key: HDFS-6564 > URL: https://issues.apache.org/jira/browse/HDFS-6564 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Rakesh R > Attachments: HDFS-6564-01.patch > > > hdfs-client should depends on slf4j instead of common-logging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
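The guard mentioned in the nit would look like this (sketch; LOG and e come from the surrounding method):
{code}
if (LOG.isDebugEnabled()) {
  LOG.debug("Response decoding failure: {}", e.toString(), e);
}
{code}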
[jira] [Commented] (HDFS-8135) Remove the deprecated FSConstants class
[ https://issues.apache.org/jira/browse/HDFS-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561205#comment-14561205 ] Sean Busbey commented on HDFS-8135: --- HBase has a temporary workaround that relies on the private HdfsConstants in HBASE-13716. That will help us with future patch-releases on HBase's 1.y release line, but in order to list Hadoop 2.8+ as okay for HBase 1.0 and 1.1 (the two minor releases that are already live) we'll need this change reverted from branch-2 so that the earlier versions of those branches will work at runtime. If possible before our next set of releases, we'd like some guidance on what Hadoop considers the correct way to get the same information we want, i.e. "is HDFS in safemode?" Some details from [~Apache9] over on HBASE-13716: {quote} Add I check the code again, HdfsUtils.isHealthy(URI) calls DistributedFileSystem.setSafeMode(GET, false), but in HBase we calls DistributedFileSystem.setSafeMode(GET, true). I think the difference is when the second parameter is true then BackupNN will throw a StandByException that force client to connect to ActiveNN. If we must connect to ActiveNN in HBase, then HdfsUtils.isHealthy(URI) is not enough. So add new methods in HdfsUtils? {quote} On the general issue of HBase's "true dependencies on Hadoop" we have an umbrella issue now to ensure that by HBase 2.0 we have a well defined interface point: HBASE-13740. In the mean-time, I could add a nightly build job that both projects get notice of that attempts to build the current HBase branch-1 against the current Hadoop branch-2. > Remove the deprecated FSConstants class > --- > > Key: HDFS-8135 > URL: https://issues.apache.org/jira/browse/HDFS-8135 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Li Lu > Fix For: 2.8.0 > > Attachments: HDFS-8135-041315.patch > > > The {{FSConstants}} class has been marked as deprecated since 0.23. There is > no uses of this class in the current code base and it can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
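For reference, the safemode probe quoted above is the two-argument setSafeMode call on DistributedFileSystem; a sketch of the check HBase performs (HdfsConstants is the IA.Private class at issue):
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.HdfsConstants;

final class SafeModeProbe {
  // Passing 'true' as the second argument forces the call through the active
  // NN (a standby throws StandbyException), per the quoted comment above.
  static boolean inSafeMode(Configuration conf) throws IOException {
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
    return dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_GET, true);
  }
}
{code}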
[jira] [Commented] (HDFS-8135) Remove the deprecated FSConstants class
[ https://issues.apache.org/jira/browse/HDFS-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552405#comment-14552405 ] Sean Busbey commented on HDFS-8135: --- So long as there is a {{HdfsConstants.SafeModeAction}} and the same is available in FSConstants, then HBase can keep listing branch-2 versions as usable with the HBase 1.y versions we have released. We're happy to move to something else entirely for checking if HDFS is in safe mode, but what version of HBase we can do that in depends on what versions of HDFS the method is available in. The 0.98 branch needs things that start at Hadoop 2.2, HBase 1.y needs things to start at Hadoop 2.4 (and preferably 2.2). > Remove the deprecated FSConstants class > --- > > Key: HDFS-8135 > URL: https://issues.apache.org/jira/browse/HDFS-8135 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Li Lu > Fix For: 2.8.0 > > Attachments: HDFS-8135-041315.patch > > > The {{FSConstants}} class has been marked as deprecated since 0.23. There is > no uses of this class in the current code base and it can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8135) Remove the deprecated FSConstants class
[ https://issues.apache.org/jira/browse/HDFS-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey reopened HDFS-8135: --- This change breaks HBase. The comment at the start of the removed class was {code} - * @deprecated Please use {@link HdfsConstants}. This class - * is left only for other ecosystem projects which depended on - * it for SafemodeAction and DatanodeReport types. - */ {code} A few things # please mark this change as breaking and include a release note, since the javadocs expressly say it was there for ecosystem projects (even though it does not carry a proper InterfaceAudience annotation) # consider limiting this change to trunk and leaving it out of branch-2 # HdfsConstants is labeled InterfaceAudience.Private, so what am I supposed to move HBase to? > Remove the deprecated FSConstants class > --- > > Key: HDFS-8135 > URL: https://issues.apache.org/jira/browse/HDFS-8135 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Li Lu > Fix For: 2.8.0 > > Attachments: HDFS-8135-041315.patch > > > The {{FSConstants}} class has been marked as deprecated since 0.23. There is > no uses of this class in the current code base and it can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8345) Storage policy APIs must be exposed via the FileSystem interface
[ https://issues.apache.org/jira/browse/HDFS-8345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546233#comment-14546233 ] Sean Busbey commented on HDFS-8345: --- {quote} bq. Can you point me to the exposure? BlockStoragePolicy is IA.Private, and I would expect we can use covariant return types to make a source-compatible update. It was tagged IA.Private but returned via a LimitedPrivate interface, unfortunate as that is. Changing the name, package or public interface of BlockStoragePolicy would technically be incompatible. However if you suggest a better name for the new interface I'd be happy to change it. {quote} Which LimitedPrivate though? I'd like to make sure we can't do the change in a compatible way. > Storage policy APIs must be exposed via the FileSystem interface > > > Key: HDFS-8345 > URL: https://issues.apache.org/jira/browse/HDFS-8345 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Labels: BB2015-05-TBR > Attachments: HDFS-8345.01.patch, HDFS-8345.02.patch, > HDFS-8345.03.patch, HDFS-8345.04.patch, HDFS-8345.05.patch, > HDFS-8345.06.patch, HDFS-8345.07.patch > > > The storage policy APIs are not exposed via FileSystem. Since > DistributedFileSystem is tagged as LimitedPrivate we should expose the APIs > through FileSystem for use by other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8345) Storage policy APIs must be exposed via the FileSystem interface
[ https://issues.apache.org/jira/browse/HDFS-8345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546117#comment-14546117 ] Sean Busbey commented on HDFS-8345: --- {quote} bq. It would be nice if the returned values from getAllStoragePolicies were usable in a setStoragePolicy method. getAllStoragePolicies returns a collection, you can use BlockStoragePolicySpi#getName on a collection element as the parameter to setStoragePolicy. {quote} That's not so bad. Is there a reason not to include a convenience method that just takes a BlockStoragePolicySpi object directly? {quote} bq. Could we update the FileSystem specification documents? Which specification docs are you referring to? {quote} The one in the docs: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html {quote} bq. Could we switch to calling the interface BlockStoragePolicy and make the one in HDFS "DFSBlockStoragePolicy" or the like? "Spi" is a term of art that's going to be less accessible for folks. bq. Should the various return types in BlockStoragePolicySpi be returning List or Set or SortedSet? I agree on both points. However it's likely incompatible since BlockStoragePolicy was exposed via a limited private interface in 2.6. {quote} Can you point me to the exposure? BlockStoragePolicy is IA.Private, and I would expect we can use covariant return types to make a source-compatible update. > Storage policy APIs must be exposed via the FileSystem interface > > > Key: HDFS-8345 > URL: https://issues.apache.org/jira/browse/HDFS-8345 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Labels: BB2015-05-TBR > Attachments: HDFS-8345.01.patch, HDFS-8345.02.patch, > HDFS-8345.03.patch, HDFS-8345.04.patch, HDFS-8345.05.patch, HDFS-8345.06.patch > > > The storage policy APIs are not exposed via FileSystem. Since > DistributedFileSystem is tagged as LimitedPrivate we should expose the APIs > through FileSystem for use by other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
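The covariant-return idea mentioned above, in miniature (hypothetical types, for illustration only):
{code}
import java.util.Collection;
import java.util.Collections;
import java.util.List;

interface PolicySource {
  Collection<?> getPolicies();
}

class DfsPolicySource implements PolicySource {
  // Narrowing the return type is source-compatible: existing callers of
  // either declared type still compile unchanged.
  @Override
  public List<String> getPolicies() {
    return Collections.emptyList();
  }
}
{code}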
[jira] [Commented] (HDFS-8332) DFS client API calls should check filesystem closed
[ https://issues.apache.org/jira/browse/HDFS-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546107#comment-14546107 ] Sean Busbey commented on HDFS-8332: --- +1 for trunk/branch-3 only. A "Known Issue" note on the next set of branch-2 release notes would be a nice-to-have as well. {quote} Also, I'd like to suggest that we change pre-commit to trigger hadoop-hdfs-httpfs tests automatically for all hadoop-hdfs patches. We've seen problems like this in the past. hadoop-hdfs-httpfs gets patched so infrequently that it's easy to miss it when a hadoop-hdfs change introduces a test failure. As a practical matter, we might not be able to add those tests until the current HDFS test runs get optimized. {quote} Leave a note on HADOOP-11929; [~aw] is already specifying that hadoop-hdfs needs to have hadoop-common built with native bits. Not sure whether expanding that to "under tests always do this other module if this module changes" will be in scope there or needs a new ticket. > DFS client API calls should check filesystem closed > --- > > Key: HDFS-8332 > URL: https://issues.apache.org/jira/browse/HDFS-8332 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 2.8.0 > > Attachments: HDFS-8332-000.patch, HDFS-8332-001.patch, > HDFS-8332-002-Branch-2.patch, HDFS-8332-002.patch, > HDFS-8332.001.branch-2.patch > > > I could see {{listCacheDirectives()}} and {{listCachePools()}} APIs can be > called even after the filesystem close. Instead these calls should do > {{checkOpen}} and throws: > {code} > java.io.IOException: Filesystem closed > at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:464) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8345) Storage policy APIs must be exposed via the FileSystem interface
[ https://issues.apache.org/jira/browse/HDFS-8345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545531#comment-14545531 ] Sean Busbey commented on HDFS-8345: --- It would be nice if the returned values from getAllStoragePolicies were usable in a setStoragePolicy method. Could we update the FileSystem specification documents? {code} + * Set the storage policy for a given file or directory. + * + * @param path file or directory path. + * @param policyName the name of the target storage policy. The list + * of supported Storage policies can be retrieved + * via {@link #getStoragePolicyNames}. + */ + public void setStoragePolicy(final Path path, final String policyName) + throws IOException { {code} Note that it's an optional operation and some implementations will throw UnsupportedOperationException. {code} + /** + * Retrieve all the storage policies supported by this file system. + * + * @return all storage policies supported by this filesystem. + * @throws IOException + */ + public List getAllStoragePolicies() + throws IOException { {code} Note that it's an optional operation and some implementations will throw UnsupportedOperationException. {code} + * @param policyName the name of the target storage policy. The list + * of supported Storage policies can be retrieved + * via {@link #getStoragePolicyNames}. + */ {code} The method "getStoragePolicyNames" didn't make it into the patch. Update it to point at the new method? {code} + public List getAllStoragePolicies() + throws IOException { {code} Should this return type be Collection? Does the ordering matter? Maybe it should be a Set? {code} +public interface BlockStoragePolicySpi { + {code} Could we switch to calling the interface BlockStoragePolicy and make the one in HDFS "DFSBlockStoragePolicy" or the like? "Spi" is a term of art that's going to be less accessible for folks. {code} + StorageType[] getStorageTypes(); {code} Should the various return types in BlockStoragePolicySpi be returning List or Set or SortedSet? {code} return fs.createNonRecursive(f, permission, flags, bufferSize, replication, blockSize, -progress); + progress); } {code} nit: this whitespace fix doesn't appear related to the rest of the changes in the file (but I'm not sure what Hadoop's norm is for 'related') > Storage policy APIs must be exposed via the FileSystem interface > > > Key: HDFS-8345 > URL: https://issues.apache.org/jira/browse/HDFS-8345 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Labels: BB2015-05-TBR > Attachments: HDFS-8345.01.patch, HDFS-8345.02.patch, > HDFS-8345.03.patch, HDFS-8345.04.patch, HDFS-8345.05.patch > > > The storage policy APIs are not exposed via FileSystem. Since > DistributedFileSystem is tagged as LimitedPrivate we should expose the APIs > through FileSystem for use by other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
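A hedged sketch of the "optional operation" pattern and the convenience overload suggested above. The signatures and stand-in types are illustrative, not the API as committed for HDFS-8345.

{code}
import java.io.IOException;

// Path and BlockStoragePolicySpi here are stand-ins for the Hadoop types.
class Path {}
interface BlockStoragePolicySpi { String getName(); }

abstract class FileSystemSketch {
  // Optional operation: the base refuses, and implementations that
  // support storage policies override it.
  public void setStoragePolicy(Path path, String policyName)
      throws IOException {
    throw new UnsupportedOperationException(
        getClass().getSimpleName() + " doesn't support setStoragePolicy");
  }

  // Convenience overload suggested in the comment: accept the policy
  // object returned by getAllStoragePolicies() and delegate by name.
  public void setStoragePolicy(Path path, BlockStoragePolicySpi policy)
      throws IOException {
    setStoragePolicy(path, policy.getName());
  }
}
{code}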
[jira] [Commented] (HDFS-8332) DFS client API calls should check filesystem closed
[ https://issues.apache.org/jira/browse/HDFS-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545457#comment-14545457 ] Sean Busbey commented on HDFS-8332: --- Being incompatible and breaking some tests are two different problems. It's true that just because tests fail it does not mean a change is incompatible. However, this change is still incompatible. * The [FileSystem specification doesn't say that all operations must fail after a close|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/filesystem.html] * Neither does the javadoc on FileSystem.close (though it does imply it) * The [specification specifically says that HDFS' behavior is correct|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/extending.html] I agree that this change is good and one we should do. However, it *will* break some downstream user code that worked before. A good sign of this is that it broke some code maintained by the Hadoop community, ostensibly those most familiar with how HDFS works. It's important that we properly document when we change things in a way that might break downstream users (whether or not they were doing the correct thing before) so that they can make appropriate adjustments before upgrading, especially when those changes are in a minor version. > DFS client API calls should check filesystem closed > --- > > Key: HDFS-8332 > URL: https://issues.apache.org/jira/browse/HDFS-8332 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 2.8.0 > > Attachments: HDFS-8332-000.patch, HDFS-8332-001.patch, > HDFS-8332-002-Branch-2.patch, HDFS-8332-002.patch, > HDFS-8332.001.branch-2.patch > > > I could see {{listCacheDirectives()}} and {{listCachePools()}} APIs can be > called even after the filesystem close. Instead these calls should do > {{checkOpen}} and throws: > {code} > java.io.IOException: Filesystem closed > at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:464) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
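To make the breakage concrete, an assumed example of downstream code that exercised the old behavior; it presumes fs.defaultFS points at an HDFS cluster (otherwise the cast fails).

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class ClosedClientExample {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS is an hdfs:// URI.
    DistributedFileSystem fs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    fs.close();
    // Before HDFS-8332 a call like this could still reach the NameNode
    // after close(); afterwards it throws
    // java.io.IOException: Filesystem closed
    fs.listCachePools();
  }
}
{code}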
[jira] [Commented] (HDFS-8412) Fix the test failures in HTTPFS
[ https://issues.apache.org/jira/browse/HDFS-8412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545437#comment-14545437 ] Sean Busbey commented on HDFS-8412: --- +1 (non-binding) > Fix the test failures in HTTPFS > --- > > Key: HDFS-8412 > URL: https://issues.apache.org/jira/browse/HDFS-8412 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G > Attachments: HDFS-8412-0.patch > > > Currently 2 HTTPFS test cases are failing due to the filesystem open check in fs > operations. This is the JIRA to fix these failures. > The failure is that the > test case closes the fs first and then does an operation. Ideally such a test > could pass earlier, as the dfsClient was just contacting the NN directly. But that > particular closed client will not be useful for any other ops like > read/write. So, the usage should be corrected here IMO. > {code} > fs.close(); > fs.setReplication(path, (short) 2); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
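A sketch of the corrected ordering the description calls for; the path is illustrative.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CorrectedUsage {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path path = new Path("/tmp/example"); // illustrative path
    fs.setReplication(path, (short) 2);   // operate while the fs is open
    fs.close();                           // close only when finished
  }
}
{code}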
[jira] [Commented] (HDFS-8332) DFS client API calls should check filesystem closed
[ https://issues.apache.org/jira/browse/HDFS-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545141#comment-14545141 ] Sean Busbey commented on HDFS-8332: --- Also, please release-note this as an incompatible change in behavior. > DFS client API calls should check filesystem closed > --- > > Key: HDFS-8332 > URL: https://issues.apache.org/jira/browse/HDFS-8332 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 2.8.0 > > Attachments: HDFS-8332-000.patch, HDFS-8332-001.patch, > HDFS-8332-002-Branch-2.patch, HDFS-8332-002.patch, > HDFS-8332.001.branch-2.patch > > > I could see {{listCacheDirectives()}} and {{listCachePools()}} APIs can be > called even after the filesystem close. Instead these calls should do > {{checkOpen}} and throws: > {code} > java.io.IOException: Filesystem closed > at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:464) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
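For context, a minimal sketch of the guard pattern HDFS-8332 introduces; this is not the actual DFSClient source, just the shape of the check.

{code}
import java.io.IOException;
import java.util.Collections;
import java.util.List;

class ClientSketch {
  private volatile boolean clientRunning = true;

  public void close() {
    clientRunning = false;
  }

  private void checkOpen() throws IOException {
    if (!clientRunning) {
      throw new IOException("Filesystem closed");
    }
  }

  public List<String> listCachePools() throws IOException {
    checkOpen();                    // fail fast on a closed client
    return Collections.emptyList(); // placeholder for the real RPC
  }
}
{code}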
[jira] [Updated] (HDFS-8332) DFS client API calls should check filesystem closed
[ https://issues.apache.org/jira/browse/HDFS-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-8332: -- Hadoop Flags: Incompatible change,Reviewed (was: Reviewed) > DFS client API calls should check filesystem closed > --- > > Key: HDFS-8332 > URL: https://issues.apache.org/jira/browse/HDFS-8332 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 2.8.0 > > Attachments: HDFS-8332-000.patch, HDFS-8332-001.patch, > HDFS-8332-002-Branch-2.patch, HDFS-8332-002.patch, > HDFS-8332.001.branch-2.patch > > > I could see {{listCacheDirectives()}} and {{listCachePools()}} APIs can be > called even after the filesystem close. Instead these calls should do > {{checkOpen}} and throws: > {code} > java.io.IOException: Filesystem closed > at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:464) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8332) DFS client API calls should check filesystem closed
[ https://issues.apache.org/jira/browse/HDFS-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey reopened HDFS-8332: --- According to git-bisect, this is the commit that caused the following hadoop-hdfs-httpfs tests to start failing on trunk: * TestHttpFSFWithSWebhdfsFileSystem * TestHttpFSWithHttpFSFileSystem * TestHttpFSFWithWebhdfsFileSystem {code} $ git bisect start trunk b46c2bb51ae524e6640756620f70e5925cda7592 Bisecting: 272 revisions left to test after this (roughly 8 steps) [baf8bc6c488de170d2caf76d9fa4c99faaa8f1a6] HDFS-4448. Allow HA NN to start in secure mode with wildcard address configured (atm via asuresh) $ git bisect run mvn -Dtest=TestHttpFSF*,TestHttpFSWithHttpFSFileSystem clean package ...SNIP... e16f4b7f70b8675760cf5aaa471dfe29d48041e6 is the first bad commit commit e16f4b7f70b8675760cf5aaa471dfe29d48041e6 Author: Uma Maheswara Rao G Date: Fri May 8 12:26:47 2015 +0530 HDFS-8332. DFS client API calls should check filesystem closed. Contributed by Rakesh R. :04 04 db7a6b4555c1bd18e8fe0a97a6689f7cf9ce15ec f9e0818f6198fbc0ac94b2d82bef7f065a90cc03 M hadoop-hdfs-project bisect run success {code} > DFS client API calls should check filesystem closed > --- > > Key: HDFS-8332 > URL: https://issues.apache.org/jira/browse/HDFS-8332 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 2.8.0 > > Attachments: HDFS-8332-000.patch, HDFS-8332-001.patch, > HDFS-8332-002-Branch-2.patch, HDFS-8332-002.patch, > HDFS-8332.001.branch-2.patch > > > I could see {{listCacheDirectives()}} and {{listCachePools()}} APIs can be > called even after the filesystem close. Instead these calls should do > {{checkOpen}} and throws: > {code} > java.io.IOException: Filesystem closed > at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:464) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8406) Lease recovery continually failed
[ https://issues.apache.org/jira/browse/HDFS-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544830#comment-14544830 ] Sean Busbey commented on HDFS-8406: --- I hit this on an HBase cluster a few weeks ago but was never able to track down what did it. At the time I presumed I had messed up the HDFS installation and just filed HBASE-13540 and HBASE-13602 to make it easier to work around. I might be able to track down some old logs from HBase hitting it if it'll help. > Lease recovery continually failed > - > > Key: HDFS-8406 > URL: https://issues.apache.org/jira/browse/HDFS-8406 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Keith Turner > > While testing Accumulo on a cluster and killing processes, I ran into a > situation where the lease on an accumulo write ahead log in HDFS could not be > recovered. Even restarting HDFS and Accumulo would not fix the problem. > The following message was seen in an Accumulo tablet server log immediately > before the tablet server was killed. > {noformat} > 2015-05-14 17:12:37,466 [hdfs.DFSClient] WARN : DFSOutputStream > ResponseProcessor exception for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 > java.io.IOException: Bad response ERROR for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 from datanode > 10.1.5.9:50010 > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:897) > 2015-05-14 17:12:37,466 [hdfs.DFSClient] WARN : Error Recovery for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 in pipeline > 10.1.5.55:50010, 10.1.5.9:5 > {noformat} > Before recovering data from a write ahead log, the Accumulo master attempts > to recover the lease. This repeatedly failed with messages like the > following. > {noformat} > 2015-05-14 17:14:54,301 [recovery.HadoopLogCloser] WARN : Error recovering > lease on > hdfs://10.1.5.6:1/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): > failed to create file > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 for > DFSClient_NONMAPREDUCE_950713214_16 for client 10.1.5.158 because > pendingCreates is non-null but no leases found. > {noformat} > Below is some info from the NN logs for the problematic file. > {noformat} > [ec2-user@leader2 logs]$ grep 3a731759-3594-4535-8086-245 > hadoop-ec2-user-namenode-leader2.log > 2015-05-14 17:10:46,299 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > allocateBlock: > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2. > BP-802741494-10.1.5.6-1431557089849 > blk_1073932823_192060{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-ffe07d7d-0e68-45b8-b3d5-c976f1716481:NORMAL:10.1.5.55:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-6efec702-3f1f-4ec0-a31f-de947e7e6097:NORMAL:10.1.5.9:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-5e27df17-abf8-47df-b4bc-c38d0cd426ea:NORMAL:10.1.5.45:50010|RBW]]} > 2015-05-14 17:10:46,628 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > fsync: /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 for > DFSClient_NONMAPREDUCE_-1128465883_16 > 2015-05-14 17:14:49,288 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: recoverLease: [Lease. 
> Holder: DFSClient_NONMAPREDUCE_-1128465883_16, pendingcreates: 1], > src=/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 from > client DFSClient_NONMAPREDUCE_-1128465883_16 > 2015-05-14 17:14:49,288 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. > Holder: DFSClient_NONMAPREDUCE_-1128465883_16, pendingcreates: 1], > src=/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 > 2015-05-14 17:14:49,289 WARN org.apache.hadoop.hdfs.StateChange: DIR* > NameSystem.internalReleaseLease: File > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 has not been > closed. Lease recovery is in progress. RecoveryId = 192257 for block > blk_1073932823_192060{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, > replicas=[ReplicaUnderConstruction[[DISK]DS-ffe07d7d-0e68-45b8-b3d5-c976f1716481:NORMAL:10.1.5.55:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-6efec702-3f1f-4ec0-a31f-de947e7e6097:NORMAL:10.1.5.9:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-5e27df17-abf8-47df-b4bc-c38d0cd426ea:NORMAL:10.1.5.45:50010|RBW]]} > java.lang.IllegalStateException: Failed to finalize INodeFile > 3a731759-3594-4535-8086-245eed7cd4c2 since blocks[0] is non-complete, where > blocks=[blk_1073932823_192257{blockUCState=COMMITTED, primaryNodeIndex=2, > replicas=[ReplicaUnderConstruction[[D
[jira] [Commented] (HDFS-8381) Reduce time taken for complete HDFS unit test run
[ https://issues.apache.org/jira/browse/HDFS-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541241#comment-14541241 ] Sean Busbey commented on HDFS-8381: --- It really helped the HBase builds when we categorized tests by run time and functional component. Then it was much easier to decide what should run for default dev builds and various jenkins jobs. > Reduce time taken for complete HDFS unit test run > - > > Key: HDFS-8381 > URL: https://issues.apache.org/jira/browse/HDFS-8381 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Reporter: Arpit Agarwal > > HDFS unit tests take a long time to run. Our unit tests are more like > system/integration tests since we spin up a MiniDFSCluster for individual > test cases. A number of tests have sleeps which further adds to the run time. > A better option is to use more fine-grained unit tests specific to individual > classes. I did not find any existing Jiras for this so filing one to track > this work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
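A sketch of what runtime-based categorization can look like with JUnit 4 categories, the mechanism HBase uses for its SmallTests/MediumTests/LargeTests split; the marker interface here is an illustrative stand-in.

{code}
import org.junit.Test;
import org.junit.experimental.categories.Category;

// Marker interface is illustrative; HBase defines SmallTests, MediumTests
// and LargeTests for this purpose and filters on them in surefire.
interface SmallTests {}

@Category(SmallTests.class)
public class ExampleSmallTest {
  @Test
  public void validatesPureLogicQuickly() {
    // no MiniDFSCluster spin-up; should finish in milliseconds
  }
}
{code}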
[jira] [Updated] (HDFS-5461) fallback to non-ssr(local short circuit reads) while oom detected
[ https://issues.apache.org/jira/browse/HDFS-5461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-5461: -- Status: Open (was: Patch Available) [~xieliang007], are you still interested in working on this patch? Can you update it based on the feedback thus far? My summary from reading through: * Add a configuration for "maximum direct buffer pool size in bytes" * Trigger fall-back when over this limit * Update the "used memory" stat to be in bytes * Add a test that sets the max pool size very small and triggers the limit * Make sure we have condition testing for both BlockReaderLocal creation and BlockReaderFactory.getLegacyBlockReaderLocal > fallback to non-ssr(local short circuit reads) while oom detected > - > > Key: HDFS-5461 > URL: https://issues.apache.org/jira/browse/HDFS-5461 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.2.0, 3.0.0 >Reporter: Liang Xie >Assignee: Liang Xie > Labels: BB2015-05-TBR > Attachments: HDFS-5461.txt > > > Currently, the DirectBufferPool used by the ssr feature doesn't seem to have an > upper-bound limit except the DirectMemory VM option, so there's a risk of > hitting a direct memory OOM. See HBASE-8143 for an example. > IMHO, maybe we could improve it a bit: > 1) detect an OOM, or hitting a configured upper limit, from the caller, then fall back to > non-ssr > 2) add a new metric for the current raw consumed direct memory size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
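A hedged sketch of the summarized feedback: a pool that tracks direct memory in bytes against a configurable cap and asks the caller to fall back when over it. The class and the config key are hypothetical, not from the attached patch.

{code}
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names and the config key below are illustrative.
class BoundedDirectBufferPool {
  static final String MAX_POOL_BYTES_KEY =
      "dfs.client.read.shortcircuit.buffer.pool.max.bytes"; // hypothetical key
  private final long maxBytes;
  private final AtomicLong usedBytes = new AtomicLong();

  BoundedDirectBufferPool(long maxBytes) {
    this.maxBytes = maxBytes;
  }

  /** @return a direct buffer, or null when over the cap (caller falls back). */
  ByteBuffer tryAllocate(int size) {
    if (usedBytes.addAndGet(size) > maxBytes) {
      usedBytes.addAndGet(-size);
      return null; // caller should fall back to a non-ssr block reader
    }
    return ByteBuffer.allocateDirect(size);
  }

  void release(ByteBuffer buf) {
    usedBytes.addAndGet(-buf.capacity());
  }

  long getUsedBytes() { // metric: current raw consumed direct memory, in bytes
    return usedBytes.get();
  }
}
{code}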
[jira] [Updated] (HDFS-5270) Use thread pools in the datenode daemons
[ https://issues.apache.org/jira/browse/HDFS-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-5270: -- Status: Open (was: Patch Available) > Use thread pools in the datenode daemons > > > Key: HDFS-5270 > URL: https://issues.apache.org/jira/browse/HDFS-5270 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Haohui Mai >Assignee: Haohui Mai > Labels: BB2015-05-TBR > Attachments: HDFS-5270.000.patch, HDFS-5270.2.patch, > TestConcurrentAccess.java > > > The current implementation of the datanode creates a thread when a new > request comes in. This incurs high overheads for the creation / destruction > of threads, making the datanode unstable under high concurrent loads. > This JIRA proposes to use a thread pool to reduce the overheads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5270) Use thread pools in the datenode daemons
[ https://issues.apache.org/jira/browse/HDFS-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-5270: -- Status: Patch Available (was: Open) > Use thread pools in the datenode daemons > > > Key: HDFS-5270 > URL: https://issues.apache.org/jira/browse/HDFS-5270 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Haohui Mai >Assignee: Haohui Mai > Labels: BB2015-05-TBR > Attachments: HDFS-5270.000.patch, HDFS-5270.2.patch, > TestConcurrentAccess.java > > > The current implementation of the datanode creates a thread when a new > request comes in. This incurs high overheads for the creation / destruction > of threads, making the datanode unstable under high concurrent loads. > This JIRA proposes to use a thread pool to reduce the overheads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5270) Use thread pools in the datenode daemons
[ https://issues.apache.org/jira/browse/HDFS-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-5270: -- Attachment: HDFS-5270.2.patch Patch rebased to current trunk. Attaching for a full QA bot run. There are a few test failures when I attempt to run through things locally. {code} $ mvn -Dtest=TestBlock*,TestDataNode* package ... SNIP ... Running org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS Tests run: 4, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 146.067 sec <<< FAILURE! - in org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS testWrite(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS) Time elapsed: 62.509 sec <<< ERROR! java.io.IOException: All datanodes DatanodeInfoWithStorage[127.0.0.1:54558,DS-bc806196-a774-4af3-afe7-d6d88c53d15b,DISK] are bad. Aborting... at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1224) at org.apache.hadoop.hdfs.DataStreamer.processDatanodeError(DataStreamer.java:1016) at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:403) testAppend(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS) Time elapsed: 62.484 sec <<< ERROR! java.io.IOException: All datanodes DatanodeInfoWithStorage[127.0.0.1:54662,DS-116db650-c3c9-4dee-9c3d-4343f12888d8,DISK] are bad. Aborting... at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1224) at org.apache.hadoop.hdfs.DataStreamer.processDatanodeError(DataStreamer.java:1016) at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:403) ... SNIP ... Running org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.564 sec <<< FAILURE! - in org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery testRaceBetweenReplicaRecoveryAndFinalizeBlock(org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery) Time elapsed: 5.98 sec <<< FAILURE! java.lang.AssertionError: Recovery should be initiated successfully at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery.testRaceBetweenReplicaRecoveryAndFinalizeBlock(TestBlockRecovery.java:638) ...SNIP... {code} [~wheat9], are you still interested in this ticket? Presuming the above come back as a problem on Jenkins, could you take a look at these failures? > Use thread pools in the datenode daemons > > > Key: HDFS-5270 > URL: https://issues.apache.org/jira/browse/HDFS-5270 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Haohui Mai >Assignee: Haohui Mai > Labels: BB2015-05-TBR > Attachments: HDFS-5270.000.patch, HDFS-5270.2.patch, > TestConcurrentAccess.java > > > The current implementation of the datanode creates a thread when a new > request comes in. This incurs high overheads for the creation / destruction > of threads, making the datanode unstable under high concurrent loads. > This JIRA proposes to use a thread pool to reduce the overheads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
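For context, an illustrative sketch of the change under review: thread-per-request dispatch replaced by a bounded pool. This is not the actual DataXceiver code; the class name and pool size are assumptions.

{code}
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only; the real patch touches the datanode's xceiver path.
class PooledServer {
  // Bounded pool: threads are reused instead of created per request.
  private final ExecutorService pool = Executors.newFixedThreadPool(64);

  void serve(ServerSocket server) throws Exception {
    while (true) {
      Socket s = server.accept();
      // Old style would be: new Thread(() -> handle(s)).start();
      pool.execute(() -> handle(s)); // reuse threads, bound concurrency
    }
  }

  private void handle(Socket s) {
    // ... read the op and stream the block ...
  }
}
{code}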
[jira] [Updated] (HDFS-8101) DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at runtime
[ https://issues.apache.org/jira/browse/HDFS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-8101: -- Status: Patch Available (was: In Progress) > DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at > runtime > --- > > Key: HDFS-8101 > URL: https://issues.apache.org/jira/browse/HDFS-8101 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Minor > Attachments: HDFS-8101.1.patch.txt > > > Previously, all references to DFSConfigKeys in DFSClient were compile time > constants which meant that normal users of DFSClient wouldn't resolve > DFSConfigKeys at run time. As of HDFS-7718, DFSClient has a reference to a > member of DFSConfigKeys that isn't compile time constant > (DFS_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT). > Since the class must be resolved now, this particular member > {code} > public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT = > AuthFilter.class.getName(); > {code} > means that javax.servlet.Filter needs to be on the classpath. > javax-servlet-api is one of the properly listed dependencies for HDFS, > however if we replace {{AuthFilter.class.getName()}} with the equivalent > String literal then downstream folks can avoid including it while maintaining > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8101) DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at runtime
[ https://issues.apache.org/jira/browse/HDFS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-8101: -- Attachment: HDFS-8101.1.patch.txt Manually inspected javap output for DFSConfigKeys and NameNodeHttpServer (it's what uses AuthFilter) to verify that NameNodeHttpServer didn't change. Checked DFSConfigKeys for other webhdfs class references. > DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at > runtime > --- > > Key: HDFS-8101 > URL: https://issues.apache.org/jira/browse/HDFS-8101 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Minor > Attachments: HDFS-8101.1.patch.txt > > > Previously, all references to DFSConfigKeys in DFSClient were compile time > constants which meant that normal users of DFSClient wouldn't resolve > DFSConfigKeys at run time. As of HDFS-7718, DFSClient has a reference to a > member of DFSConfigKeys that isn't compile time constant > (DFS_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT). > Since the class must be resolved now, this particular member > {code} > public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT = > AuthFilter.class.getName(); > {code} > means that javax.servlet.Filter needs to be on the classpath. > javax-servlet-api is one of the properly listed dependencies for HDFS, > however if we replace {{AuthFilter.class.getName()}} with the equivalent > String literal then downstream folks can avoid including it while maintaining > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-8101) DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at runtime
[ https://issues.apache.org/jira/browse/HDFS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-8101 started by Sean Busbey. - > DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at > runtime > --- > > Key: HDFS-8101 > URL: https://issues.apache.org/jira/browse/HDFS-8101 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Minor > > Previously, all references to DFSConfigKeys in DFSClient were compile time > constants which meant that normal users of DFSClient wouldn't resolve > DFSConfigKeys at run time. As of HDFS-7718, DFSClient has a reference to a > member of DFSConfigKeys that isn't compile time constant > (DFS_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT). > Since the class must be resolved now, this particular member > {code} > public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT = > AuthFilter.class.getName(); > {code} > means that javax.servlet.Filter needs to be on the classpath. > javax-servlet-api is one of the properly listed dependencies for HDFS, > however if we replace {{AuthFilter.class.getName()}} with the equivalent > String literal then downstream folks can avoid including it while maintaining > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8101) DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at runtime
[ https://issues.apache.org/jira/browse/HDFS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-8101: -- Summary: DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at runtime (was: DFSConfigKeys pulls in WebHDFS classes at runtime) > DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at > runtime > --- > > Key: HDFS-8101 > URL: https://issues.apache.org/jira/browse/HDFS-8101 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Minor > > Previously, all references to DFSConfigKeys in DFSClient were compile time > constants which meant that normal users of DFSClient wouldn't resolve > DFSConfigKeys at run time. As of HDFS-7718, DFSClient has a reference to a > member of DFSConfigKeys that isn't compile time constant > (DFS_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT). > Since the class must be resolved now, this particular member > {code} > public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT = > AuthFilter.class.getName(); > {code} > means that javax.servlet.Filter needs to be on the classpath. > javax-servlet-api is one of the properly listed dependencies for HDFS, > however if we replace {{AuthFilter.class.getName()}} with the equivalent > String literal then downstream folks can avoid including it while maintaining > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8101) DFSConfigKeys pulls in WebHDFS classes at runtime
Sean Busbey created HDFS-8101: - Summary: DFSConfigKeys pulls in WebHDFS classes at runtime Key: HDFS-8101 URL: https://issues.apache.org/jira/browse/HDFS-8101 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Previously, all references to DFSConfigKeys in DFSClient were compile time constants which meant that normal users of DFSClient wouldn't resolve DFSConfigKeys at run time. As of HDFS-7718, DFSClient has a reference to a member of DFSConfigKeys that isn't compile time constant (DFS_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT). Since the class must be resolved now, this particular member {code} public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT = AuthFilter.class.getName(); {code} means that javax.servlet.Filter needs to be on the classpath. javax-servlet-api is one of the properly listed dependencies for HDFS, however if we replace {{AuthFilter.class.getName()}} with the equivalent String literal then downstream folks can avoid including it while maintaining compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
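A sketch of why the literal fixes the problem: a String constant expression is inlined by javac into the referencing class, while a value computed at class-initialization time is not, so the latter forces the keys class (and whatever it references) to load at runtime. Everything here other than the AuthFilter class name is an illustrative stand-in.

{code}
class FilterStandIn {} // stands in for AuthFilter, which implements javax.servlet.Filter

class KeysSketch {
  // Constant expression: javac inlines the literal into referencing
  // classes, so KeysSketch never has to be loaded by those clients.
  static final String FILTER_BY_LITERAL =
      "org.apache.hadoop.hdfs.web.AuthFilter";

  // Not a constant expression: readers must load and initialize
  // KeysSketch, which in turn resolves FilterStandIn (and, for the real
  // AuthFilter, its javax.servlet.Filter supertype).
  static final String FILTER_BY_CLASS = FilterStandIn.class.getName();
}
{code}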
[jira] [Commented] (HDFS-7911) Buffer Overflow when running HBase on HDFS Encryption Zone
[ https://issues.apache.org/jira/browse/HDFS-7911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359399#comment-14359399 ] Sean Busbey commented on HDFS-7911: --- [~xyao] could you test the patch over on HADOOP-11710 to see if it fixes the problem for you? > Buffer Overflow when running HBase on HDFS Encryption Zone > -- > > Key: HDFS-7911 > URL: https://issues.apache.org/jira/browse/HDFS-7911 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption >Affects Versions: 2.6.0 >Reporter: Xiaoyu Yao >Assignee: Yi Liu >Priority: Blocker > > Create an HDFS EZ for HBase under /apps/hbase with some basic testing passed, > including creating tables, listing, adding a few rows, scanning them, etc. > However, when doing bulk load 100's k rows. After 10 minutes or so, we get > the following error on the Region Server that owns the table. > {code} > 2015-03-02 10:25:47,784 FATAL [regionserver60020-WAL.AsyncSyncer0] > wal.FSHLog: Error while AsyncSyncer sync, request close of hlog > java.io.IOException: java.nio.BufferOverflowException > at > org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:156) > at > org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.encrypt(JceAesCtrCryptoCodec.java:127) > at > org.apache.hadoop.crypto.CryptoOutputStream.encrypt(CryptoOutputStream.java:162) > > at > org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:232) > > at > org.apache.hadoop.crypto.CryptoOutputStream.hflush(CryptoOutputStream.java:267) > > at > org.apache.hadoop.crypto.CryptoOutputStream.sync(CryptoOutputStream.java:262) > at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:123) > at > org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165) > > at > org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241) > > at java.lang.Thread.run(Thread.java:744) > Caused by: java.nio.BufferOverflowException > at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:357) > at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:823) > at javax.crypto.CipherSpi.engineUpdate(CipherSpi.java:546) > at javax.crypto.Cipher.update(Cipher.java:1760) > at > org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:145) > ... 9 more > {code} > It looks like the HBase WAL (Write Ahead Log) use case is broken on the > CryptoOutputStream(). The use case has one flusher thread that keeps calling > the hflush() on WAL file while other roller threads are trying to write > concurrently to that same file handle. > As the class comments mentioned. *""CryptoOutputStream encrypts data. It is > not thread-safe."* I check the code and it seems the buffer overflow is > related to the race between the CryptoOutputStream#write() and > CryptoOutputStream#flush() as both can call CryptoOutputStream#encrypt(). The > inBuffer/outBuffer of the CryptoOutputStream is not thread safe. They can be > changed during encrypt for flush() when write() is coming from other threads. > I have validated this with multi-threaded unit tests that mimic the HBase WAL > use case. For file not under encryption zone (*DFSOutputStream*), > multi-threaded flusher/writer works fine. For file under encryption zone > (*CryptoOutputStream*), multi-threaded flusher/writer randomly fails with > Buffer Overflow/Underflow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7911) Buffer Overflow when running HBase on HDFS Encryption Zone
[ https://issues.apache.org/jira/browse/HDFS-7911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358951#comment-14358951 ] Sean Busbey commented on HDFS-7911: --- We should take a more general fix and instead move the synchronization to FSDataOutputStream, as suggested in HADOOP-11708 > Buffer Overflow when running HBase on HDFS Encryption Zone > -- > > Key: HDFS-7911 > URL: https://issues.apache.org/jira/browse/HDFS-7911 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption >Affects Versions: 2.6.0 >Reporter: Xiaoyu Yao >Assignee: Yi Liu >Priority: Blocker > > Create an HDFS EZ for HBase under /apps/hbase with some basic testing passed, > including creating tables, listing, adding a few rows, scanning them, etc. > However, when doing bulk load 100's k rows. After 10 minutes or so, we get > the following error on the Region Server that owns the table. > {code} > 2015-03-02 10:25:47,784 FATAL [regionserver60020-WAL.AsyncSyncer0] > wal.FSHLog: Error while AsyncSyncer sync, request close of hlog > java.io.IOException: java.nio.BufferOverflowException > at > org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:156) > at > org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.encrypt(JceAesCtrCryptoCodec.java:127) > at > org.apache.hadoop.crypto.CryptoOutputStream.encrypt(CryptoOutputStream.java:162) > > at > org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:232) > > at > org.apache.hadoop.crypto.CryptoOutputStream.hflush(CryptoOutputStream.java:267) > > at > org.apache.hadoop.crypto.CryptoOutputStream.sync(CryptoOutputStream.java:262) > at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:123) > at > org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165) > > at > org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241) > > at java.lang.Thread.run(Thread.java:744) > Caused by: java.nio.BufferOverflowException > at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:357) > at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:823) > at javax.crypto.CipherSpi.engineUpdate(CipherSpi.java:546) > at javax.crypto.Cipher.update(Cipher.java:1760) > at > org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:145) > ... 9 more > {code} > It looks like the HBase WAL (Write Ahead Log) use case is broken on the > CryptoOutputStream(). The use case has one flusher thread that keeps calling > the hflush() on WAL file while other roller threads are trying to write > concurrently to that same file handle. > As the class comments mentioned. *""CryptoOutputStream encrypts data. It is > not thread-safe."* I check the code and it seems the buffer overflow is > related to the race between the CryptoOutputStream#write() and > CryptoOutputStream#flush() as both can call CryptoOutputStream#encrypt(). The > inBuffer/outBuffer of the CryptoOutputStream is not thread safe. They can be > changed during encrypt for flush() when write() is coming from other threads. > I have validated this with multi-threaded unit tests that mimic the HBase WAL > use case. For file not under encryption zone (*DFSOutputStream*), > multi-threaded flusher/writer works fine. For file under encryption zone > (*CryptoOutputStream*), multi-threaded flusher/writer randomly fails with > Buffer Overflow/Underflow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
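A hedged sketch of the direction HADOOP-11708 suggests: move the synchronization up to the wrapping stream so the non-thread-safe CryptoOutputStream underneath never sees concurrent write()/flush() calls. Simplified and assumed, not the actual FSDataOutputStream change.

{code}
import java.io.IOException;
import java.io.OutputStream;

// Sketch only: serialize write() and flush() on one lock so concurrent
// callers (e.g. an HBase WAL flusher and writers) cannot interleave
// encrypt() calls on the wrapped stream.
class SynchronizedOutputStream extends OutputStream {
  private final OutputStream out;

  SynchronizedOutputStream(OutputStream out) {
    this.out = out;
  }

  @Override
  public synchronized void write(int b) throws IOException {
    out.write(b);
  }

  @Override
  public synchronized void write(byte[] b, int off, int len) throws IOException {
    out.write(b, off, len);
  }

  @Override
  public synchronized void flush() throws IOException {
    out.flush();
  }
}
{code}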
[jira] [Updated] (HDFS-7919) Time.NANOSECONDS_PER_MILLISECOND - use class level final constant instead of method variable
[ https://issues.apache.org/jira/browse/HDFS-7919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HDFS-7919: -- Labels: beginner (was: ) > Time.NANOSECONDS_PER_MILLISECOND - use class level final constant instead of > method variable > - > > Key: HDFS-7919 > URL: https://issues.apache.org/jira/browse/HDFS-7919 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ajith S >Assignee: Ajith S >Priority: Trivial > Labels: beginner > > NANOSECONDS_PER_MILLISECOND constant can be moved to class level instead of > creating it in each method call. > {code} > org.apache.hadoop.util.Time.java > public static long monotonicNow() { > final long NANOSECONDS_PER_MILLISECOND = 1000000; > return System.nanoTime() / NANOSECONDS_PER_MILLISECOND; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
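The change the ticket asks for, sketched: hoist the constant to the class level, with the standard value of 1,000,000 nanoseconds per millisecond.

{code}
public final class Time {
  private static final long NANOSECONDS_PER_MILLISECOND = 1000000L;

  private Time() {} // utility class, not instantiable

  public static long monotonicNow() {
    return System.nanoTime() / NANOSECONDS_PER_MILLISECOND;
  }
}
{code}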
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350711#comment-14350711 ] Sean Busbey commented on HDFS-6200: --- As I mentioned earlier, the dependencies your client artifact brings with it are a defining part of the interface you are exposing downstream applications to. That means we need the ability to manipulate those dependencies, even if we're only going to do so at a later date. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we, e.g., decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go. I only mention the internal artifact as an alternative if having DFSClient live in hadoop-hdfs is undesirable. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain. However, there might be other mitigating factors I'm not aware of that make breaking the code into a new module desirable. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-client, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350369#comment-14350369 ] Sean Busbey commented on HDFS-6200: --- The dependencies you bring with you are an integral part of the interface you define for downstream clients. While I agree that it can be a separate subtask, it has to be considered as part of how you structure the overall approach. {quote} Unfortunately the dependency is a real one – the webhdfs server on DN uses DFSClient to read data from HDFS. {quote} Our own internal use of client interfaces isn't the same thing as downstream application uses. For one, we don't have to worry about what dependencies we bring with us in the internal case because by definition we're in control of both the client interface and the place it's being used. In the approach I'm suggesting, the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to use DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact, e.g.: {noformat} hadoop-hdfs -- depends on --> hadoop-hdfs-client-internal hadoop-hdfs-client -- depends on --> hadoop-hdfs-client-internal {noformat} > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-client, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6200) Create a separate jar for hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349520#comment-14349520 ] Sean Busbey commented on HDFS-6200: --- Since this new artifact is opt-in (clients would have to change to it), we could instead use it to build an aggregate jar with no transitive dependencies. For this approach, we should not make the old hadoop-hdfs depend on it (since there will presumably be some shaded or otherwise isolated version of third-party libraries present). We could still do the move incrementally by relying on maven to build the artifact with just those classes we need from hadoop-hdfs. That way, extant downstream applications that want to keep the current behavior can keep depending on hadoop-hdfs (or hadoop-client or whatever), and downstream applications that want the improved client dependency can change. When we're ready for a breaking change, we can similarly announce that downstream applications should not be relying on hadoop-hdfs. > Create a separate jar for hdfs-client > - > > Key: HDFS-6200 > URL: https://issues.apache.org/jira/browse/HDFS-6200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-6200.000.patch, HDFS-6200.001.patch, > HDFS-6200.002.patch, HDFS-6200.003.patch, HDFS-6200.004.patch, > HDFS-6200.005.patch, HDFS-6200.006.patch, HDFS-6200.007.patch > > > Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs > client. As discussed in the hdfs-dev mailing list > (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), > downstream projects are forced to bring in additional dependency in order to > access hdfs. The additional dependency sometimes can be difficult to manage > for projects like Apache Falcon and Apache Oozie. > This jira proposes to create a new project, hadoop-hdfs-client, which > contains the client side of the hdfs code. Downstream projects can use this > jar instead of the hadoop-hdfs to avoid unnecessary dependency. > Note that it does not break the compatibility of downstream projects. This is > because old downstream projects implicitly depend on hadoop-hdfs-client > through the hadoop-hdfs jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7040) HDFS dangerously uses @Beta methods from very old versions of Guava
[ https://issues.apache.org/jira/browse/HDFS-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131641#comment-14131641 ] Sean Busbey commented on HDFS-7040: --- FWIW, HBase currently has a copied reimplementation of LimitInputStream (since HBASE-9667) for this same reason. > HDFS dangerously uses @Beta methods from very old versions of Guava > --- > > Key: HDFS-7040 > URL: https://issues.apache.org/jira/browse/HDFS-7040 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.4.0, 2.5.0, 2.4.1 >Reporter: Christopher Tubbs > Labels: beta, deprecated, guava > Attachments: 0001-HDFS-7040-Avoid-beta-LimitInputStream-in-Guava.patch > > > HDFS uses LimitInputStream from Guava. This was introduced as @Beta and is > risky for any application to use. > The problem is further exacerbated by Hadoop's dependency on Guava version > 11.0.2, which is quite old for an active project (Feb. 2012). > Because Guava is very stable, projects which depend on Hadoop and use Guava > themselves, can use up through Guava version 14.x > However, in version 14, Guava deprecated LimitInputStream and provided a > replacement. Because they make no guarantees about compatibility about @Beta > classes, they removed it in version 15. > What should be done: Hadoop should update its dependency on Guava to at > least version 14 (currently Guava is on version 19). This should have little > impact on users, because Guava is so stable. > HDFS should then be patched to use the provided alternative to > LimitInputStream, so that downstream packagers, users, and application > developers requiring more recent versions of Guava (to fix bugs, to use new > features, etc.) will be able to swap out the Guava dependency without > breaking Hadoop. > Alternative: While Hadoop cannot predict the marking and removal of > deprecated code, it can, and should, avoid the use of @Beta classes and > methods that do not offer guarantees. If the dependency cannot be bumped, > then it should be relatively trivial to provide an internal class with the > same functionality, that does not rely on the older version of Guava. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
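For reference, the replacement Guava 14+ provides for the removed @Beta LimitInputStream is ByteStreams.limit, which wraps a stream so at most the given number of bytes can be read from it. A small usage example:

{code}
import java.io.ByteArrayInputStream;
import java.io.InputStream;

import com.google.common.io.ByteStreams;

public class LimitExample {
  public static void main(String[] args) throws Exception {
    InputStream raw = new ByteArrayInputStream(new byte[100]);
    // At most 10 bytes can be read through this wrapper.
    InputStream first10 = ByteStreams.limit(raw, 10);
    byte[] buf = new byte[100];
    int n = first10.read(buf); // reads at most 10 bytes
    System.out.println("read " + n + " bytes");
  }
}
{code}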