[VOTE] Release Apache Hadoop 2.1.1-beta
Folks,

I've created a release candidate (rc0) for hadoop-2.1.1-beta that I would like to get released - this release fixes a number of bugs on top of hadoop-2.1.0-beta as a result of significant amounts of testing.

If things go well, this might be the last of the *beta* releases of hadoop-2.x.

The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.1.1-beta-rc0
The RC tag in svn is here: http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.1.1-beta-rc0

The maven artifacts are available via repository.apache.org.

Please try the release and vote; the vote will run for the usual 7 days.

thanks,
Arun

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: [VOTE] Release Apache Hadoop 2.1.1-beta
Thanks Arun.

+1

* Downloaded source tarball
* Verified MD5
* Verified signature
* Ran apache-rat:check; OK after a minor tweak (see NIT1 below)
* Checked CHANGES.txt headers (see NIT2 below)
* Built DIST from source
* Verified the Hadoop version of the Hadoop JARs
* Configured a pseudo cluster
* Tested HttpFS
* Ran a few MR examples
* Ran a few unmanaged AM app examples

The following NITs should be addressed if there is a new RC, or otherwise in the next release.

NIT1: empty files that make apache-rat:check fail; these files should be removed:

* /Users/tucu/Downloads/h/hadoop-2.1.1-beta-src/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FileContextSymlinkBaseTest.java
* /Users/tucu/Downloads/h/hadoop-2.1.1-beta-src/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestLocalFSFileContextSymlink.java
* /Users/tucu/Downloads/h/hadoop-2.1.1-beta-src/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestFcHdfsSymlink.java

NIT2: the common/hdfs/mapreduce/yarn CHANGES.txt files have a 2.2.0 header; they should not.

--
Alejandro

On Tue, Sep 17, 2013 at 8:38 AM, Arun C Murthy a...@hortonworks.com wrote:
[snip]
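For anyone repeating the "Verified MD5" step from the checklist above, the check can be sketched in plain Java. This is only a minimal sketch: the class name `Md5Check` and its helpers are illustrative assumptions, and the expected digest is assumed to be supplied by the caller as a hex string (whitespace and letter case are ignored, since published digest files are often grouped and upper-cased).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Illustrative helper (hypothetical name), not part of the Hadoop release tooling.
class Md5Check {

    /** Streams the file through MD5 and returns the digest as lowercase hex. */
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Compares the computed digest with an expected one, ignoring whitespace and case. */
    static boolean matches(Path file, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        String cleaned = expectedHex.replaceAll("\\s", "").toLowerCase(Locale.ROOT);
        return md5Hex(file).equals(cleaned);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: java Md5Check <file> <expected-md5-hex>");
            return;
        }
        boolean ok = matches(Paths.get(args[0]), args[1]);
        System.out.println(ok ? "MD5 OK" : "MD5 MISMATCH");
    }
}
```

MD5 only guards against a corrupted download; the separate "Verified signature" step is what ties the artifact to the release manager's key.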
[jira] [Created] (HADOOP-9971) Pack hadoop compress native libs and upload it to maven for other projects to depend on
Liu Shaohui created HADOOP-9971:

Summary: Pack hadoop compress native libs and upload it to maven for other projects to depend on
Key: HADOOP-9971
URL: https://issues.apache.org/jira/browse/HADOOP-9971
Project: Hadoop Common
Issue Type: Improvement
Reporter: Liu Shaohui
Priority: Minor
Attachments: HADOOP-9971-trunk-v1.diff

Currently, if other projects like hbase want to use the hadoop common native libs, they must copy the native libs into their own distribution, which is not agile. Following the idea of hadoop-snappy (http://code.google.com/p/hadoop-snappy), we can pack the hadoop common native libs and upload them to a maven repository for other projects to depend on.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [VOTE] Release Apache Hadoop 2.1.1-beta
Hey all,

Sorry to hijack the vote thread, but it'd be good to get some input on my email from yesterday re: symlink support in branch-2.1. I think it really should be in GA one way or the other.

http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201309.mbox/%3CCAGB5D2ZDjqt69oFfv_HOsWEH18T9GanuuF1Y%3DaKG-JptvV3ViA%40mail.gmail.com%3E

Thanks,
Andrew

On Tue, Sep 17, 2013 at 2:23 AM, Alejandro Abdelnur t...@cloudera.com wrote:
[snip]
Re: symlink support in Hadoop 2 GA
I think it makes sense to finish symlinks support in the Hadoop 2 GA release.

Colin

On Mon, Sep 16, 2013 at 6:49 PM, Andrew Wang andrew.w...@cloudera.com wrote:

Hi all,

I wanted to broadcast plans for putting the FileSystem symlinks work (HADOOP-8040) into branch-2.1 for the pending Hadoop 2 GA release. I think it's pretty important we get it in since it's not a compatible change; if it misses the GA train, we're not going to have symlinks until the next major release.

However, we're still dealing with ongoing issues revealed via testing. There's user-code out there that only handles files and directories and will barf when given a symlink (perhaps a dangling one!). See HADOOP-9912 for a nice example where globStatus returning symlinks broke Pig; some of us had a conference call to talk it through, and one definite conclusion was that this wasn't solvable in a generally compatible manner.

There are also still some gaps in symlink support right now. For example, the more esoteric FileSystems like WebHDFS, HttpFS, and HFTP need symlink resolution, and tooling like the FsShell and Distcp still need to be updated as well.

So, there's definitely work to be done, but there are a lot of users interested in the feature, and symlinks really should be in GA. Would appreciate any thoughts/input on the matter.

Thanks,
Andrew
Re: symlink support in Hadoop 2 GA
I agree that this is an important change. However, 2.2.0 GA is getting ready to roll out in weeks. I am concerned that these changes will add not only incompatible changes late in the game, but also possibly instability. Java API incompatibility is something we have avoided for the most part, and I am concerned that this adds such incompatibility to the FileSystem APIs.

We should find workarounds, possibly by adding newer APIs and leaving the existing APIs as is. If this can be done, my vote is to enable this feature in 2.3. Even if it cannot be done, I am concerned that this is coming quite late, and we should see if we could allow some incompatible changes into 2.3 for this feature.

On Mon, Sep 16, 2013 at 6:49 PM, Andrew Wang andrew.w...@cloudera.com wrote:
[snip]
Re: symlink support in Hadoop 2 GA
The issue is not modifying existing APIs. The issue is that code has been written that makes assumptions that are incompatible with the existence of things that are not files or directories. For example, there is a lot of code out there that looks at FileStatus#isFile, and if it returns false, assumes that what it is looking at is a directory. In the case of a symlink, this assumption is incorrect.

Faced with this, we have considered making the default behavior of listStatus and globStatus fully resolve symlinks and simply not list dangling symlinks. Code that is prepared to deal with symlinks can use newer versions of the listStatus and globStatus functions, which do return symlinks as symlinks.

We might consider making FileSystem#listStatus and FileSystem#globStatus fully resolve symlinks by default, and making FileContext#listStatus and FileContext#Util#globStatus do the opposite. This seems like the maximally compatible solution that we're going to get. I think this makes sense.

The alternative is kicking the can down the road to Hadoop 3, and letting vendors of alternative (including some proprietary alternative) systems continue to claim that Hadoop doesn't support symlinks yet (with some justice).

P.S. I would be fine with putting this in 2.2 or 2.3 if that seems more appropriate.

sincerely,
Colin

On Tue, Sep 17, 2013 at 8:23 AM, Suresh Srinivas sur...@hortonworks.com wrote:
[snip]
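The FileStatus#isFile assumption described in this message can be illustrated with a small self-contained sketch. The `Status` class below is a hypothetical stand-in, not Hadoop's `org.apache.hadoop.fs.FileStatus`; it only mirrors the three predicates `isFile`, `isDirectory` and `isSymlink`, under the assumption that exactly one of them is true for a given entry.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the "not a file, so it must be a directory" assumption.
class SymlinkAssumption {

    enum Kind { FILE, DIRECTORY, SYMLINK }

    /** Stand-in for a file status object; not the real Hadoop class. */
    static final class Status {
        final String path;
        final Kind kind;

        Status(String path, Kind kind) {
            this.path = path;
            this.kind = kind;
        }

        boolean isFile() { return kind == Kind.FILE; }
        boolean isDirectory() { return kind == Kind.DIRECTORY; }
        boolean isSymlink() { return kind == Kind.SYMLINK; }
    }

    /** The fragile pattern: anything that is not a file is treated as a directory. */
    static String classifyNaively(Status s) {
        return s.isFile() ? "file" : "directory";
    }

    /** Symlink-aware version: every kind is handled explicitly. */
    static String classifyCarefully(Status s) {
        if (s.isSymlink()) {
            return "symlink";
        }
        if (s.isDirectory()) {
            return "directory";
        }
        return "file";
    }

    public static void main(String[] args) {
        List<Status> entries = Arrays.asList(
                new Status("/data/part-00000", Kind.FILE),
                new Status("/data/archive", Kind.DIRECTORY),
                new Status("/data/latest", Kind.SYMLINK));
        for (Status s : entries) {
            System.out.println(s.path + ": naive=" + classifyNaively(s)
                    + ", careful=" + classifyCarefully(s));
        }
    }
}
```

In this model the naive classifier reports the symlink as a directory. Existing code can be broken this way once symlinks start appearing in listings, even though no method signature has changed.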
[jira] [Created] (HADOOP-9972) new APIs for listStatus and globStatus to deal with symlinks
Colin Patrick McCabe created HADOOP-9972:

Summary: new APIs for listStatus and globStatus to deal with symlinks
Key: HADOOP-9972
URL: https://issues.apache.org/jira/browse/HADOOP-9972
Project: Hadoop Common
Issue Type: Improvement
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to deal with symlinks.

The issue is that code has been written which is incompatible with the existence of things which are not files or directories. For example, there is a lot of code out there that looks at FileStatus#isFile, and if it returns false, assumes that what it is looking at is a directory. In the case of a symlink, this assumption is incorrect.

It seems reasonable to make the default behavior of {{FileSystem#listStatus}} and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring dangling ones. This will prevent incompatibility with existing MR jobs and other HDFS users. We should also add new versions of listStatus and globStatus that allow new, symlink-aware code to deal with symlinks as symlinks.
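The proposed default (fully resolve symlinks, ignore dangling ones) next to an opt-in raw listing can be sketched against a toy in-memory namespace. Everything here is an assumption made for illustration: `ResolvingListing`, `listResolved` and `listRaw` are hypothetical names, not the APIs this issue would add. The sketch also drops cyclic links and reports the target's own path for a resolved link, two points the description leaves open.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy in-memory namespace (hypothetical); not Hadoop's FileSystem.
class ResolvingListing {

    enum Kind { FILE, DIRECTORY, SYMLINK }

    static final class Entry {
        final String path;
        final Kind kind;
        final String target; // only meaningful for SYMLINK entries

        Entry(String path, Kind kind, String target) {
            this.path = path;
            this.kind = kind;
            this.target = target;
        }
    }

    private final Map<String, Entry> namespace = new LinkedHashMap<>();

    ResolvingListing add(String path, Kind kind, String target) {
        namespace.put(path, new Entry(path, kind, target));
        return this;
    }

    /** Follows a chain of links; returns null for dangling or cyclic links. */
    Entry resolve(Entry e) {
        Set<String> seen = new HashSet<>();
        while (e != null && e.kind == Kind.SYMLINK) {
            if (!seen.add(e.path)) {
                return null; // link cycle
            }
            e = namespace.get(e.target); // null when the target is missing
        }
        return e;
    }

    /** Opt-in view for symlink-aware callers: links are returned as links. */
    List<Entry> listRaw() {
        return new ArrayList<>(namespace.values());
    }

    /** Compatible default view: only files and directories are ever returned. */
    List<Entry> listResolved() {
        List<Entry> out = new ArrayList<>();
        for (Entry e : namespace.values()) {
            Entry resolved = resolve(e);
            if (resolved != null) {
                out.add(resolved);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ResolvingListing ns = new ResolvingListing()
                .add("/d/a", Kind.FILE, null)
                .add("/d/link", Kind.SYMLINK, "/d/a")
                .add("/d/dangling", Kind.SYMLINK, "/d/missing");
        System.out.println("raw entries: " + ns.listRaw().size());
        System.out.println("resolved entries: " + ns.listResolved().size());
    }
}
```

Callers of the resolved view never see a SYMLINK entry, so code that only distinguishes files from directories keeps working. The cost is that dangling links silently disappear from the listing.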
[jira] [Created] (HADOOP-9973) wrong dependencies
Nicolas Liochon created HADOOP-9973:

Summary: wrong dependencies
Key: HADOOP-9973
URL: https://issues.apache.org/jira/browse/HADOOP-9973
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.1.0-beta, 2.1.1-beta
Reporter: Nicolas Liochon
Priority: Minor

See HBASE-9557 for the impact: for some of them, it seems it's pushing these dependencies to the client applications even if they are not used.

mvn dependency:analyze -pl hadoop-common

[WARNING] Used undeclared dependencies found:
[WARNING]    com.google.code.findbugs:jsr305:jar:1.3.9:compile
[WARNING]    commons-collections:commons-collections:jar:3.2.1:compile
[WARNING] Unused declared dependencies found:
[WARNING]    com.sun.jersey:jersey-json:jar:1.9:compile
[WARNING]    tomcat:jasper-compiler:jar:5.5.23:runtime
[WARNING]    tomcat:jasper-runtime:jar:5.5.23:runtime
[WARNING]    javax.servlet.jsp:jsp-api:jar:2.1:runtime
[WARNING]    commons-el:commons-el:jar:1.0:runtime
[WARNING]    org.slf4j:slf4j-log4j12:jar:1.7.5:runtime

mvn dependency:analyze -pl hadoop-yarn-client

[WARNING] Used undeclared dependencies found:
[WARNING]    org.mortbay.jetty:jetty-util:jar:6.1.26:provided
[WARNING]    log4j:log4j:jar:1.2.17:compile
[WARNING]    com.google.guava:guava:jar:11.0.2:provided
[WARNING]    commons-lang:commons-lang:jar:2.5:provided
[WARNING]    commons-logging:commons-logging:jar:1.1.1:provided
[WARNING]    commons-cli:commons-cli:jar:1.2:provided
[WARNING]    org.apache.hadoop:hadoop-yarn-server-common:jar:2.1.2-SNAPSHOT:test
[WARNING] Unused declared dependencies found:
[WARNING]    org.slf4j:slf4j-api:jar:1.7.5:compile
[WARNING]    org.slf4j:slf4j-log4j12:jar:1.7.5:compile
[WARNING]    com.google.inject.extensions:guice-servlet:jar:3.0:compile
[WARNING]    io.netty:netty:jar:3.6.2.Final:compile
[WARNING]    com.google.protobuf:protobuf-java:jar:2.5.0:compile
[WARNING]    commons-io:commons-io:jar:2.1:compile
[WARNING]    org.apache.hadoop:hadoop-hdfs:jar:2.1.2-SNAPSHOT:test
[WARNING]    com.google.inject:guice:jar:3.0:compile
[WARNING]    com.sun.jersey.jersey-test-framework:jersey-test-framework-core:jar:1.9:test
[WARNING]    com.sun.jersey.jersey-test-framework:jersey-test-framework-grizzly2:jar:1.9:compile
[WARNING]    com.sun.jersey:jersey-server:jar:1.9:compile
[WARNING]    com.sun.jersey:jersey-json:jar:1.9:compile
[WARNING]    com.sun.jersey.contribs:jersey-guice:jar:1.9:compile
Re: symlink support in Hadoop 2 GA
I encourage interested parties to read through HADOOP-9912 to get a feel for the issues. There really is no way to add symlink support without changing the behavior of existing APIs. Ultimately, anything that returns a FileStatus is going to be different.

Even if we default to resolving symlinks, resolving can lead to FileNotFound or permission errors. Thus, we have to choose whether to prune the bad links, show the bad links as dangling, or throw an exception. None of these options are compatible.

I'm really concerned about putting this in a minor release like 2.3 since it has the potential to break a lot of user code. HADOOP-9912 is an example from within our own ecosystem, but think of all the custom user code out there written against FileSystem. 2.2 GA is basically our last chance to make this kind of change before Hadoop 3.

Thanks,
Andrew

On Tue, Sep 17, 2013 at 9:10 AM, Colin McCabe cmcc...@alumni.cmu.edu wrote:
[snip]
[jira] [Created] (HADOOP-9975) Adding relogin() method to UGI
Kai Zheng created HADOOP-9975:

Summary: Adding relogin() method to UGI
Key: HADOOP-9975
URL: https://issues.apache.org/jira/browse/HADOOP-9975
Project: Hadoop Common
Issue Type: Improvement
Reporter: Kai Zheng
Assignee: Kai Zheng

In the current Hadoop UGI implementation, there are API methods like reloginFromKeytab() and reloginFromTicketCache(). However, such methods are too Kerberos-specific and also expose login implementation details; it would be better to add a generic relogin() method that works regardless of the authentication mechanism. This is possible since the relevant authentication-specific parameters, like principal and keytab, are already passed and saved in the UGI object after the initial login.
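One way to picture the request is a session object that remembers how it logged in and dispatches a generic relogin() accordingly. The sketch below is purely hypothetical: `Session` and `LoginSource` are invented for illustration and are not Hadoop's `UserGroupInformation`. The two mechanism-specific methods only borrow their names from the existing UGI methods and record what they would do instead of talking to Kerberos.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a mechanism-agnostic relogin(); not Hadoop's UGI.
class ReloginSketch {

    /** How the initial login was performed; remembered for later re-logins. */
    enum LoginSource { KEYTAB, TICKET_CACHE }

    static final class Session {
        final LoginSource source;
        final String principal;
        final String keytabPath; // null for ticket-cache logins
        final List<String> log = new ArrayList<>();

        Session(LoginSource source, String principal, String keytabPath) {
            this.source = source;
            this.principal = principal;
            this.keytabPath = keytabPath;
        }

        // Mechanism-specific entry points; here they only record the action.
        void reloginFromKeytab() {
            log.add("keytab:" + principal + ":" + keytabPath);
        }

        void reloginFromTicketCache() {
            log.add("ticket-cache:" + principal);
        }

        /** Generic entry point: callers need not know which mechanism was used. */
        void relogin() {
            switch (source) {
                case KEYTAB:
                    reloginFromKeytab();
                    break;
                case TICKET_CACHE:
                    reloginFromTicketCache();
                    break;
                default:
                    throw new IllegalStateException("unknown login source: " + source);
            }
        }
    }

    public static void main(String[] args) {
        Session s = new Session(LoginSource.KEYTAB, "nn/host1@EXAMPLE.COM", "/etc/nn.keytab");
        s.relogin();
        System.out.println(s.log);
    }
}
```

The dispatch relies only on state saved at initial login, which is the property the description points to as making a generic method feasible.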
Re: [VOTE] Release Apache Hadoop 2.1.1-beta
Not sure if this should be a blocker for 2.1.1, but filed HADOOP-9976 to have a single version of avro.

On Tue, Sep 17, 2013 at 6:51 AM, Andrew Wang andrew.w...@cloudera.com wrote:
[snip]
[jira] [Created] (HADOOP-9976) Different versions of avro and avro-maven-plugin
Karthik Kambatla created HADOOP-9976:

Summary: Different versions of avro and avro-maven-plugin
Key: HADOOP-9976
URL: https://issues.apache.org/jira/browse/HADOOP-9976
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla

Post HADOOP-9672, the versions for avro and avro-maven-plugin are different - 1.7.4 and 1.5.3 respectively.
[jira] [Created] (HADOOP-9977) Hadoop services won't start with different keypass and keystorepass when https is enabled
Yesha Vora created HADOOP-9977:

Summary: Hadoop services won't start with different keypass and keystorepass when https is enabled
Key: HADOOP-9977
URL: https://issues.apache.org/jira/browse/HADOOP-9977
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Yesha Vora

Enable ssl in the configuration. While creating the keystore, give different key and keystore passwords (here, keypass = hadoop and storepass = hadoopKey):

    keytool -genkey -alias host1 -keyalg RSA -keysize 1024 -dname "CN=host1,OU=hw,O=hw,L=palo alto,ST=ca,C=us" -keypass hadoop -keystore keystore.jks -storepass hadoopKey

In ssl-server.xml, set the two properties below:

    <property><name>ssl.server.keystore.keypassword</name><value>hadoop</value></property>
    <property><name>ssl.server.keystore.password</name><value>hadoopKey</value></property>

Namenode, ResourceManager, Datanode, Nodemanager and SecondaryNamenode fail to start with the error below.

2013-09-17 21:39:00,794 FATAL namenode.NameNode (NameNode.java:main(1325)) - Exception in namenode join
java.io.IOException: java.security.UnrecoverableKeyException: Cannot recover key
    at org.apache.hadoop.http.HttpServer.init(HttpServer.java:222)
    at org.apache.hadoop.http.HttpServer.init(HttpServer.java:174)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer$1.init(NameNodeHttpServer.java:76)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:74)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:626)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:488)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:684)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:669)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1254)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320)
Caused by: java.security.UnrecoverableKeyException: Cannot recover key
    at sun.security.provider.KeyProtector.recover(KeyProtector.java:328)
    at sun.security.provider.JavaKeyStore.engineGetKey(JavaKeyStore.java:138)
    at sun.security.provider.JavaKeyStore$JKS.engineGetKey(JavaKeyStore.java:55)
    at java.security.KeyStore.getKey(KeyStore.java:792)
    at sun.security.ssl.SunX509KeyManagerImpl.init(SunX509KeyManagerImpl.java:131)
    at sun.security.ssl.KeyManagerFactoryImpl$SunX509.engineInit(KeyManagerFactoryImpl.java:68)
    at javax.net.ssl.KeyManagerFactory.init(KeyManagerFactory.java:259)
    at org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:170)
    at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:121)
    at org.apache.hadoop.http.HttpServer.init(HttpServer.java:220)
    ... 9 more
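The underlying JDK behaviour can be reproduced without Hadoop: a key entry protected with one password cannot be recovered with another. The sketch below is not the reporter's setup. As a stand-in it uses an in-memory JCEKS keystore holding an AES secret key, rather than a JKS file with an RSA key pair, so that it needs no certificate. The class name `KeyPasswordDemo` and the alias are illustrative. That Hadoop passes the keystore password where the key password is needed is only one plausible reading of the trace, stated here as an assumption.

```java
import java.security.KeyStore;
import java.security.UnrecoverableKeyException;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Illustrative stand-in (hypothetical name); not Hadoop code.
class KeyPasswordDemo {

    static final String ALIAS = "host1";

    /** Builds an in-memory keystore whose single key entry is protected by keyPass. */
    static KeyStore buildStore(char[] storePass, char[] keyPass) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, storePass); // null stream: start from an empty keystore
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        ks.setKeyEntry(ALIAS, key, keyPass, null); // secret keys carry no certificate chain
        return ks;
    }

    /** Returns true if the key entry can be recovered with the given password. */
    static boolean canRecover(KeyStore ks, char[] password) throws Exception {
        try {
            return ks.getKey(ALIAS, password) != null;
        } catch (UnrecoverableKeyException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        char[] storePass = "hadoopKey".toCharArray();
        char[] keyPass = "hadoop".toCharArray();
        KeyStore ks = buildStore(storePass, keyPass);
        System.out.println("recover with key password:   " + canRecover(ks, keyPass));
        System.out.println("recover with store password: " + canRecover(ks, storePass));
    }
}
```

`javax.net.ssl.KeyManagerFactory.init(KeyStore, char[])` takes the password used to recover keys, so a caller that hands it the store password would be expected to hit the same exception whenever the two passwords differ.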
Re: symlink support in Hadoop 2 GA
(Looping in Arun since this impacts 2.x releases)

I updated the versions on HADOOP-8040 and its sub-tasks to reflect where the changes have landed. All of these changes (modulo HADOOP-9417) were merged to branch-2.1 and are in the 2.1.0 release.

While symlinks are in 2.1.0, I don't think we can really claim they're ready until issues like HADOOP-9912 are resolved and symlinks are supported in the shell, distcp and WebHDFS/HttpFS/Hftp (these are not esoteric!). Someone can create a symlink with FileSystem, causing someone else's distcp job to fail. That is unlikely given symlinks are not exposed outside the Java API, but still not great. Ideally this work would have been done on a feature branch and then merged when complete, but that's water under the bridge.

I see the following options:

1. Fix up the current symlink support so that symlinks are ready for 2.2 (GA), or at least the public APIs. This means the APIs will be in GA from the get-go, so while the functionality might not be fully baked, we don't have to worry about incompatible changes like FileStatus#isDir() changing behavior in 2.3 or a later update. The downside is this will take at least a couple of weeks (to resolve HADOOP-9912 and potentially implement the remaining pieces) and so may impact the 2.2 release timing. This option means 2.2 won't remove the new APIs introduced in 2.1. We'd want to spin a 2.1.2 beta with the new API changes so we don't introduce new APIs in the beta-to-GA transition.

2. Revert symlinks from branch-2.1-beta and branch-2. Finish up the work in trunk (or a feature branch) and merge for a subsequent 2.x update. While this helps get us to GA faster, it would be preferable to get an API change like this in for 2.2 GA, since such changes may be disruptive to introduce in an update (e.g. see the example in #1). And of course our users would like symlink functionality in the GA release. This option would mean 2.2 is incompatible with 2.1 because it drops the new APIs, which is not ideal for a beta-to-GA transition.

3. Revert and punt symlinks to 3.x. IMO this should be the last resort.

If we have sufficient time I think option #1 would be best. What do others think?

Thanks,
Eli

On Mon, Sep 16, 2013 at 6:49 PM, Andrew Wang andrew.w...@cloudera.com wrote:
> Hi all,
>
> I wanted to broadcast plans for putting the FileSystem symlinks work (HADOOP-8040) into branch-2.1 for the pending Hadoop 2 GA release. I think it's pretty important we get it in since it's not a compatible change; if it misses the GA train, we're not going to have symlinks until the next major release.
>
> However, we're still dealing with ongoing issues revealed via testing. There's user code out there that only handles files and directories and will barf when given a symlink (perhaps a dangling one!). See HADOOP-9912 for a nice example where globStatus returning symlinks broke Pig; some of us had a conference call to talk it through, and one definite conclusion was that this wasn't solvable in a generally compatible manner.
>
> There are also still some gaps in symlink support right now. For example, the more esoteric FileSystems like WebHDFS, HttpFS, and HFTP need symlink resolution, and tooling like the FsShell and Distcp still need to be updated as well.
>
> So, there's definitely work to be done, but there are a lot of users interested in the feature, and symlinks really should be in GA. Would appreciate any thoughts/input on the matter.
>
> Thanks,
> Andrew
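
The compatibility concern in this thread (code that handles only files and directories breaking when it meets a symlink, possibly a dangling one) can be illustrated with a local-filesystem analogy. The sketch below uses Python's standard `os` module, not the Hadoop FileSystem API, and the function names `classify_naive`, `classify_symlink_aware` and `demo` are hypothetical. It only shows the general failure mode, under the assumption that the platform allows creating symlinks.

```python
import os
import tempfile


def classify_naive(path):
    """Classify a path the way pre-symlink code often does: file or directory only."""
    if os.path.isdir(path):
        return "dir"
    if os.path.isfile(path):
        return "file"
    # A dangling symlink ends up here: both checks follow the link and fail.
    raise ValueError(f"neither file nor directory: {path}")


def classify_symlink_aware(path):
    """Check for a symlink first (without following it), then fall back."""
    if os.path.islink(path):
        # os.path.exists follows the link, so False means the target is missing.
        return "symlink" if os.path.exists(path) else "dangling symlink"
    return classify_naive(path)


def demo():
    """Create a link, remove its target, and record how each classifier reacts."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "data.txt")
        link = os.path.join(root, "link")
        with open(target, "w") as f:
            f.write("x")
        os.symlink(target, link)

        results = [classify_symlink_aware(link)]
        os.remove(target)  # the link now dangles
        results.append(classify_symlink_aware(link))
        try:
            classify_naive(link)
            results.append("naive ok")
        except ValueError:
            results.append("naive failed")
        return results
```

The naive classifier is not wrong for a world without symlinks; it only fails once a new kind of entry can appear in listings, which is why the thread treats returning symlinks from existing calls as an incompatible change.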
Re: [VOTE] Release Apache Hadoop 2.1.1-beta
On Mon, Sep 16, 2013 at 11:38 PM, Arun C Murthy a...@hortonworks.com wrote:
> Folks,
>
> I've created a release candidate (rc0) for hadoop-2.1.1-beta that I would like to get released - this release fixes a number of bugs on top of hadoop-2.1.0-beta as a result of significant amounts of testing.
>
> If things go well, this might be the last of the *beta* releases of hadoop-2.x.
>
> The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.1.1-beta-rc0
> The RC tag in svn is here: http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.1.1-beta-rc0
>
> The maven artifacts are available via repository.apache.org.
>
> Please try the release and vote; the vote will run for the usual 7 days.

As usual, here's a full Bigtop stack built on top of this RC: http://bigtop01.cloudera.org:8080/view/Upstream-tests/job/Hadoop-2.1.1/

Those who feel more adventurous and want to help test the entire Hadoop ecosystem that Bigtop packages can simply yum/apt-get/zypper install everything from the packages in those locations.

I'm also running the tests on fully distributed clusters in Bigtop and will report the findings tomorrow.

Thanks,
Roman.
[jira] [Created] (HADOOP-9978) Support range reads in s3n interface to split objects for mappers to read
Amandeep Khurana created HADOOP-9978:

Summary: Support range reads in s3n interface to split objects for mappers to read
Key: HADOOP-9978
URL: https://issues.apache.org/jira/browse/HADOOP-9978
Project: Hadoop Common
Issue Type: Improvement
Reporter: Amandeep Khurana
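
HADOOP-9978 carries only a summary, so the following is a rough sketch of the general idea rather than the proposed s3n change. It assumes each mapper would fetch its part of an object with a standard HTTP `Range: bytes=start-end` request header, where both offsets are inclusive. The helper names `split_ranges` and `range_header` are hypothetical, as is the choice of a fixed split size.

```python
def split_ranges(object_size, split_size):
    """Return inclusive (start, end) byte offsets that cover object_size bytes.

    Every split is split_size bytes long except possibly the last one.
    """
    if object_size < 0 or split_size <= 0:
        raise ValueError("object_size must be >= 0 and split_size must be > 0")
    return [
        (start, min(start + split_size, object_size) - 1)
        for start in range(0, object_size, split_size)
    ]


def range_header(start, end):
    """Format the value of an HTTP Range header for an inclusive byte range."""
    return f"bytes={start}-{end}"


# One header value per split; each mapper would read only its own range.
headers = [range_header(s, e) for s, e in split_ranges(10, 4)]
```

A real input format would also have to align splits with record boundaries, which this sketch leaves out.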