[jira] [Commented] (HADOOP-8290) Remove two remaining references to hadoop.native.lib oldprop
[ https://issues.apache.org/jira/browse/HADOOP-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256594#comment-13256594 ]

Harsh J commented on HADOOP-8290:

The failing view-fs trash test is unrelated to this patch.

Remove two remaining references to hadoop.native.lib oldprop

Key: HADOOP-8290
URL: https://issues.apache.org/jira/browse/HADOOP-8290
Project: Hadoop Common
Issue Type: Improvement
Components: test
Affects Versions: 3.0.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
Attachments: HADOOP-8290.patch, HADOOP-8290.patch

The following two test files still carry the old prop name:
{code}
# modified: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/TestCodec.java
# modified: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/file/tfile/TestTFileSeqFileComparison.java
{code}
This JIRA merely fixes those up to use the new io.native.lib.enabled prop.
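For reference, the change itself is just a property-name swap in the tests' configuration setup; a minimal sketch (the surrounding test code is omitted):

{code}
// Before: the deprecated property name still used by the two tests.
conf.setBoolean("hadoop.native.lib", true);
// After: the new property name this patch switches them to.
conf.setBoolean("io.native.lib.enabled", true);
{code}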
[jira] [Commented] (HADOOP-8290) Remove two remaining references to hadoop.native.lib oldprop
[ https://issues.apache.org/jira/browse/HADOOP-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256789#comment-13256789 ]

Harsh J commented on HADOOP-8290:

Sorry, I committed this with another change already in; reverting and recommitting properly.
[jira] [Commented] (HADOOP-8290) Remove two remaining references to hadoop.native.lib oldprop
[ https://issues.apache.org/jira/browse/HADOOP-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256798#comment-13256798 ]

Harsh J commented on HADOOP-8290:

Well, damn. Blooper after blooper - I hate SVN. Give me a few minutes to clean this up properly.
[jira] [Commented] (HADOOP-8290) Remove two remaining references to hadoop.native.lib oldprop
[ https://issues.apache.org/jira/browse/HADOOP-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256809#comment-13256809 ]

Harsh J commented on HADOOP-8290:

Done. Removed the /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml changes that sneaked in with earlier commits but didn't get reverted because of a secondary mistake. Should be resolved now. Sorry for all the noise! I've now set up SVN plugins locally that indicate whether my SVN dir is clean or dirty, so I don't repeat this.
[jira] [Commented] (HADOOP-8273) Update url for commons daemon ppc64 binary tarball
[ https://issues.apache.org/jira/browse/HADOOP-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253050#comment-13253050 ]

Harsh J commented on HADOOP-8273:

Ah, sorry about that then, Ravi - it slipped my mind that we can provide a downstream artifact too.

Update url for commons daemon ppc64 binary tarball

Key: HADOOP-8273
URL: https://issues.apache.org/jira/browse/HADOOP-8273
Project: Hadoop Common
Issue Type: Bug
Components: build
Affects Versions: 1.0.2, 1.0.3
Environment: RHEL 6.1 on PowerPC with IBM Java 6.0 SR10
Reporter: Kumar Ravi

The following error message was seen while attempting to build branch-1 on PowerPC:

[get] Error opening connection java.io.FileNotFoundException: http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/linux/commons-daemon-1.0.2-bin-linux-${os-arch}.tar.gz
[get] Error opening connection java.io.FileNotFoundException: http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/linux/commons-daemon-1.0.2-bin-linux-${os-arch}.tar.gz
[get] Error opening connection java.io.FileNotFoundException: http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/linux/commons-daemon-1.0.2-bin-linux-${os-arch}.tar.gz
[get] Can't get http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/linux/commons-daemon-1.0.2-bin-linux-${os-arch}.tar.gz to /home/hadoop/branch-1/build/jsvc.ppc64/jsvc.ppc64.tar.gz

BUILD FAILED
/home/hadoop/branch-1_040612/build.xml:1606: The following error occurred while executing this line:
/home/hadoop/branch-1_040612/build.xml:2804: Can't get http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/linux/commons-daemon-1.0.2-bin-linux-${os-arch}.tar.gz to /home/hadoop/branch-1/build/jsvc.ppc64/jsvc.ppc64.tar.gz

There is no commons-daemon-1.0.2-bin-linux-ppc64.tar.gz available at the above URL.
[jira] [Commented] (HADOOP-8197) Configuration logs WARNs on every use of a deprecated key
[ https://issues.apache.org/jira/browse/HADOOP-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235685#comment-13235685 ]

Harsh J commented on HADOOP-8197:

+1, the patch looks good. One optional nitpick though: would {{warnOnceIfDeprecated}} be a better method name?

Configuration logs WARNs on every use of a deprecated key

Key: HADOOP-8197
URL: https://issues.apache.org/jira/browse/HADOOP-8197
Project: Hadoop Common
Issue Type: Bug
Components: conf
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
Fix For: 0.23.3
Attachments: HADOOP-8197.patch

The logic to print a warning only once per deprecated key does not work:
{code}
2012-03-21 22:32:58,121 WARN Configuration:661 - user.name is deprecated. Instead, use mapreduce.job.user.name
2012-03-21 22:32:58,123 WARN Configuration:661 - fs.default.name is deprecated. Instead, use fs.defaultFS
...
2012-03-21 22:32:58,130 WARN Configuration:661 - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2012-03-21 22:32:58,351 WARN Configuration:345 - fs.default.name is deprecated. Instead, use fs.defaultFS
...
2012-03-21 22:32:58,843 WARN Configuration:661 - user.name is deprecated. Instead, use mapreduce.job.user.name
2012-03-21 22:32:58,844 WARN Configuration:661 - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2012-03-21 22:32:58,844 WARN Configuration:661 - fs.default.name is deprecated. Instead, use fs.defaultFS
{code}
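For context, the warn-once guard being reviewed boils down to remembering which keys have already been logged; a minimal sketch (names and structure are assumptions, not the actual Configuration internals):

{code}
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import org.apache.commons.logging.Log;

// Keys that have already produced a deprecation WARN in this process.
private static final Set<String> warnedKeys =
    Collections.synchronizedSet(new HashSet<String>());

private static void warnOnceIfDeprecated(Log log, String oldKey, String newKey) {
  // Set.add() returns false when the key was already present, so each
  // deprecated key warns at most once.
  if (warnedKeys.add(oldKey)) {
    log.warn(oldKey + " is deprecated. Instead, use " + newKey);
  }
}
{code}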
[jira] [Commented] (HADOOP-6924) Build fails with non-Sun JREs due to different pathing to the operating system architecture shared libraries
[ https://issues.apache.org/jira/browse/HADOOP-6924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233175#comment-13233175 ]

Harsh J commented on HADOOP-6924:

Hi,

Can you please also add the appropriate fix versions?

Build fails with non-Sun JREs due to different pathing to the operating system architecture shared libraries

Key: HADOOP-6924
URL: https://issues.apache.org/jira/browse/HADOOP-6924
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 0.20.0, 0.20.1, 0.20.2, 0.21.0
Environment: SLES 10, IBM Java 6
Reporter: Stephen Watt
Assignee: Devaraj Das
Attachments: 6924-1.patch, 6924-2.branch-1.patch, 6924-2.patch, HADOOP-6924-v2.patch, HADOOP-6924.patch, apjvmlibdir.m4

The src/native/configure script used to build the native libraries has an environment variable called JNI_LDFLAGS which is set as follows:

JNI_LDFLAGS=-L$JAVA_HOME/jre/lib/$OS_ARCH/server

This pathing convention to the shared libraries for the operating system architecture is unique to Oracle/Sun Java, and thus on other flavors of Java the path will not exist and will result in a build failure with the following exception:

[exec] gcc -shared ../src/org/apache/hadoop/io/compress/zlib/.libs/ZlibCompressor.o ../src/org/apache/hadoop/io/compress/zlib/.libs/ZlibDecompressor.o -L/home/hadoop/Java-Versions/ibm-java-i386-60/jre/lib/x86/server -ljvm -ldl -m32 -m32 -Wl,-soname -Wl,libhadoop.so.1 -o .libs/libhadoop.so.1.0.0
[exec] /usr/lib/gcc/i586-suse-linux/4.1.2/../../../../i586-suse-linux/bin/ld: cannot find -ljvm
[exec] collect2: ld returned 1 exit status
[jira] [Commented] (HADOOP-8152) Expand public APIs for security library classes
[ https://issues.apache.org/jira/browse/HADOOP-8152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225080#comment-13225080 ]

Harsh J commented on HADOOP-8152:

We also have public docs on letting users use UserGroupInformation for secure impersonation, at http://hadoop.apache.org/common/docs/r1.0.0/Secure_Impersonation.html (for example).

Expand public APIs for security library classes

Key: HADOOP-8152
URL: https://issues.apache.org/jira/browse/HADOOP-8152
Project: Hadoop Common
Issue Type: Improvement
Components: security
Affects Versions: 0.23.3
Reporter: Aaron T. Myers

Currently projects like Hive and HBase use UserGroupInformation and SecurityUtil methods. Both of these classes are marked LimitedPrivate(HDFS,MR) but should probably be marked more generally public.
[jira] [Commented] (HADOOP-8094) Make maven-eclipse-plugin use the spring project nature
[ https://issues.apache.org/jira/browse/HADOOP-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212000#comment-13212000 ]

Harsh J commented on HADOOP-8094:

The patch does apply, but one file is in the root of the project and the other is in HDFS - QA can't apply this.

Make maven-eclipse-plugin use the spring project nature

Key: HADOOP-8094
URL: https://issues.apache.org/jira/browse/HADOOP-8094
Project: Hadoop Common
Issue Type: Improvement
Components: build
Affects Versions: 0.23.0
Reporter: Harsh J
Assignee: Harsh J
Labels: eclipse, maven
Attachments: HADOOP-8094.patch

If I want to have multiple versions of Apache Hadoop loaded into my Eclipse IDE today (or any other IDE, maybe), I'm supposed to do the following when generating eclipse files, so that the version name is appended to the project name and thereby resolves conflicts in project names when I import another version in:

{{mvn -Declipse.addVersionToProjectName=true eclipse:eclipse}}

But this does not work presently due to a lack of configuration in Apache Hadoop, which https://jira.codehaus.org/browse/MECLIPSE-702 demands. The problem is that though the project names are indeed named with version suffixes, the related project names it carries for dependencies do not carry the same suffix, and therefore you get broken project imports with errors everywhere about 'dependent project regularname not found'. The fix is as Carlo details on https://jira.codehaus.org/browse/MECLIPSE-702 and it works perfectly. I'll attach a patch adding the same configuration to Apache Hadoop so that the above mechanism is then possible.
[jira] [Commented] (HADOOP-8094) Make maven-eclipse-plugin use the spring project nature
[ https://issues.apache.org/jira/browse/HADOOP-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212332#comment-13212332 ]

Harsh J commented on HADOOP-8094:

(Btw, the patch does not change the default eclipse:eclipse behavior - it just adds the spring nature for those who want to leverage it. A sketch of the kind of plugin configuration involved is below.)
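For context, the kind of plugin configuration involved looks roughly like this (a sketch based on maven-eclipse-plugin's additionalProjectnatures parameter; the exact configuration is the one Carlo details on MECLIPSE-702):

{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-eclipse-plugin</artifactId>
  <configuration>
    <additionalProjectnatures>
      <projectnature>org.springframework.ide.eclipse.core.springnature</projectnature>
    </additionalProjectnatures>
  </configuration>
</plugin>
{code}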
[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207160#comment-13207160 ]

Harsh J commented on HADOOP-8055:

Mahadev,

From hadoop-assemblies' {{assemblies/hadoop-dist.xml}}, all the target does is take each project's src/main/conf and place it inside etc/conf. This change is in line with the empty configs inside other projects (hdfs and httpfs, for example), and those already show up inside the tar. I am unsure how yarn gets them in, however, because they do not carry such a dir in any of their projects. Perhaps that's why they don't appear under etc/hadoop? Is the template sourcing only done for packages, and not the tar assembly? In any case, we'd have to pick one method and use it for all projects.

Please do reopen this issue if this current way of injecting core-site.xml is incorrect!

Distribution tar.gz does not contain etc/hadoop/core-site.xml

Key: HADOOP-8055
URL: https://issues.apache.org/jira/browse/HADOOP-8055
Project: Hadoop Common
Issue Type: Bug
Components: build
Affects Versions: 0.24.0
Reporter: Eric Charles
Assignee: Harsh J
Fix For: 0.24.0, 0.23.2
Attachments: HADOOP-8055.patch, HADOOP-8055.patch

A dist built from trunk (0.24.0-SNAPSHOT) does not contain a core-site.xml in the $HADOOP_HOME/etc/hadoop/ folder. Running $HADOOP_HOME/sbin/start-dfs.sh without that file gives an exception:

Exception in thread "main" java.lang.IllegalArgumentException: URI has an authority component
at java.io.File.<init>(File.java:368)
at org.apache.hadoop.hdfs.server.namenode.NNStorage.getStorageDirectory(NNStorage.java:310)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.<init>(FSEditLog.java:178)
...

Manually creating $HADOOP_HOME/etc/hadoop/core-site.xml solves this problem and hadoop starts fine.
[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207163#comment-13207163 ]

Harsh J commented on HADOOP-8055:

-Perhaps that's why they don't appear under etc/hadoop?-

These are all I see in etc/hadoop in a tarball (from an earlier report of mine), and I do recall hdfs/httpfs being empty as well -- I didn't get around to checking yarn back then:

hadoop-metrics.properties
hadoop-metrics2.properties
hdfs-site.xml
httpfs-env.sh
httpfs-log4j.properties
httpfs-signature.secret
httpfs-site.xml
log4j.properties
slaves
ssl-client.xml.example
ssl-server.xml.example
yarn-env.sh
yarn-site.xml
[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206123#comment-13206123 ]

Harsh J commented on HADOOP-8055:

Eric - the absence of a core-site.xml means core-default.xml alone is loaded, and it has an fs.defaultFS in it set to file:/// out of the box. However, I agree our log messages and our guards can be even better. Could you file another JIRA with your suggestions? I'll be happy to discuss and carry it over from there :)
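For illustration, a minimal core-site.xml of the sort users end up creating by hand to override the file:/// default (the host and port are placeholders):

{code:xml}
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
{code}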
[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205390#comment-13205390 ]

Harsh J commented on HADOOP-8055:

Also, when using git, simply do {{git diff --no-prefix}} to generate SVN-compatible patches. I have a git alias called sdiff (short for "svn diff") that I reuse to make this easier, so for all Apache patches I go by {{git sdiff > ~/PROJ-ID.patch}} and that's good enough :)
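Defining such an alias is a one-liner; a sketch, assuming the alias name sdiff is otherwise unused:

{code}
git config --global alias.sdiff "diff --no-prefix"
# then, from the working tree:
git sdiff > ~/PROJ-ID.patch
{code}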
[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205388#comment-13205388 ]

Harsh J commented on HADOOP-8055:

Noticed the same thing when playing with the CDH4b1 assemblies of 0.23.x. Thanks for reporting this and filing a patch, Eric!

The idea of the patch is right, but the location is wrong. Based on the assemblies from hadoop-assemblies, from what little I know of mvn, it seems we assemble the etc/hadoop bits from each project's src/main/conf directory. Sure enough, hadoop-common's src/main/conf is empty. We need the core-site.xml created there for an ideal fix, and it will get assembled. Could you refactor your patch to do this instead of using hdfs?
[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205415#comment-13205415 ]

Harsh J commented on HADOOP-8055:

The code I was talking about is around line 34 of hadoop-assemblies/src/main/resources/assemblies/hadoop-dist.xml. The simple fix is to move your template core-site.xml to hadoop-common-project/hadoop-common/src/main/conf, and I'll run the build and commit it in. I tested this fix locally; it seems to work well and produce the file, but I'll have to run tests via hudson to know for sure.

Regarding the patch: if the patch is not -p0 applicable, the Hadoop QA buildbot will fail to apply it cleanly for testing (when we run it via Submit Patch from the JIRA), and hence it is required that patches be svn-compatible :(
[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205416#comment-13205416 ]

Harsh J commented on HADOOP-8055:

Eric,

bq. (btw, where can newbies view a picture of the branches and their relations with trunk?)

It's odd that we don't have a resource of our own, but check out https://blogs.apache.org/bigtop/entry/all_you_wanted_to_know and http://www.cloudera.com/blog/2012/01/an-update-on-apache-hadoop-1-0/ -- they should help clear all the confusion created by the half-baked rename of branches lately.
[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205539#comment-13205539 ]

Harsh J commented on HADOOP-8055:

Tom - it would not resolve the issue if we just created the file; rather, I think what Eric meant to say was that he wanted to resolve it by adding fs.defaultFS, could not find core-site.xml, and hence had to create it on his own.
[jira] [Commented] (HADOOP-8056) Configuration doesn't pass empty string values to tasks
[ https://issues.apache.org/jira/browse/HADOOP-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205564#comment-13205564 ]

Harsh J commented on HADOOP-8056:

Luca,

Could you also tell us if this is a regression, or a bug report because you feel the behavior should be different?

Configuration doesn't pass empty string values to tasks

Key: HADOOP-8056
URL: https://issues.apache.org/jira/browse/HADOOP-8056
Project: Hadoop Common
Issue Type: Bug
Components: conf
Affects Versions: 0.20.2, 1.0.0
Reporter: Luca Pireddu

If I assign an *empty string* as a value to a property in a JobConf 'job' while I'm preparing it to run, the Configuration does store that value. I can retrieve it later in the same process and the value is maintained. However, if I then call JobClient.runJob(job), the Configuration that is received by the Map and Reduce tasks doesn't contain the property, and calling JobConf.get with that property name returns null (instead of an empty string). Further, if I inspect the job's configuration via Hadoop's web interface, the property isn't present. It seems as if whatever serialization mechanism is used to transmit the Configuration from the job client to the tasks discards properties with an empty value.
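A minimal sketch of the reported behavior (the property name is illustrative):

{code}
JobConf job = new JobConf();
job.set("my.empty.prop", "");
// In the submitting JVM this returns "" as stored...
String local = job.get("my.empty.prop");
// ...but per this report, once the config is serialized and shipped to the
// Map/Reduce tasks, job.get("my.empty.prop") there returns null instead.
{code}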
[jira] [Commented] (HADOOP-8045) org.apache.hadoop.mapreduce.lib.output.MultipleOutputs does not handle many files well
[ https://issues.apache.org/jira/browse/HADOOP-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204360#comment-13204360 ]

Harsh J commented on HADOOP-8045:

Tarjei,

Perhaps you may want to increase the xceiver limits on your DN to allow more writes?

org.apache.hadoop.mapreduce.lib.output.MultipleOutputs does not handle many files well

Key: HADOOP-8045
URL: https://issues.apache.org/jira/browse/HADOOP-8045
Project: Hadoop Common
Issue Type: Bug
Components: io
Affects Versions: 0.21.0, 1.0.0
Environment: Cloudera CH3 release.
Reporter: Tarjei Huse
Labels: patch
Attachments: hadoop-multiple-outputs.patch

We were trying to use MultipleOutputs to write one file per key. This produced the error:

exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/me/part6/_temporary/_attempt_201202071305_0017_r_00_2/2011-11-18-22-attempt_201202071305_0017_r_00_2-r-0 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1520)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:665)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)

when the number of files processed increased past 20 on a single developer system.

The solution proved to be to close each RecordWriter when the reducer was finished with a key, something that required extending MultipleOutputs to fetch the RecordWriter - not a good solution.
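For reference, the DataNode transfer-thread limit mentioned above is configured in hdfs-site.xml via the (historically misspelled) property below; the value shown is illustrative, not a recommendation:

{code:xml}
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
{code}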
[jira] [Commented] (HADOOP-8045) org.apache.hadoop.mapreduce.lib.output.MultipleOutputs does not handle many files well
[ https://issues.apache.org/jira/browse/HADOOP-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204468#comment-13204468 ]

Harsh J commented on HADOOP-8045:

You should see them in the DN logs (the "count exceeded" form of messages). Are they not present?
[jira] [Commented] (HADOOP-8045) org.apache.hadoop.mapreduce.lib.output.MultipleOutputs does not handle many files well
[ https://issues.apache.org/jira/browse/HADOOP-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204582#comment-13204582 ]

Harsh J commented on HADOOP-8045:

OK, that's a little odd. But the NN does exclude DNs based on their transfer-thread load. That is what is affecting you -- error at the DN or not -- because of 120 write requests per task (are you sure you want small files?). You could also raise your settings 2x and see whether the problem eases or goes away.

In any case, I'm +1 on adding a specific closing API to MultipleOutputs to close a given named output (a sketch follows below). Can you, however, add it to the mapred.lib.MultipleOutputs (Stable API) as well?

Comments on the existing patch, btw:
* The javadoc can actually reside over the new function you've added. Something like: "This function is useful in reducers where, after writing a particular key as an output, you may close it to save on fs connections."
* Once closed, the writer must be moved out of the collection.
* The new addition requires test cases, as nothing covers this API call right now. Please add a test case that tests your new method. There are existing tests inside TestMultipleOutputs (Stable API - you need to add) and TestMRMultipleOutputs (Unstable, new API - your patch).

Thanks!
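A rough sketch of such a per-output close, for illustration only (the method name, the recordWriters cache field, and the context variable are assumptions, not the committed implementation):

{code}
/**
 * Closes the RecordWriter of the given named output and removes it from
 * the writer cache, releasing its filesystem connection early.
 */
public void close(String namedOutput) throws IOException, InterruptedException {
  RecordWriter<?, ?> writer = recordWriters.remove(namedOutput);
  if (writer != null) {
    writer.close(context);
  }
}
{code}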
[jira] [Commented] (HADOOP-8024) Debugging Hadoop daemons in Eclipse / Netbeans debugger
[ https://issues.apache.org/jira/browse/HADOOP-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201035#comment-13201035 ]

Harsh J commented on HADOOP-8024:

Hi,

Yes, getting daemons to run inside an IDE is a matter of having the environment properly set. Will you be filing a doc patch or contributing a wiki page dedicated to this? IIRC you could also do this via 'Run Configuration' integration (we had that for 0.21/0.22, I think?), or other better techniques, so that the dev does not have to set the runners up himself.

Debugging Hadoop daemons in Eclipse / Netbeans debugger

Key: HADOOP-8024
URL: https://issues.apache.org/jira/browse/HADOOP-8024
Project: Hadoop Common
Issue Type: Improvement
Components: test
Affects Versions: 0.20.203.0
Environment: Windows/Linux; Eclipse and Netbeans
Reporter: Ambud Sharma
Priority: Minor
Labels: documentation
Original Estimate: 0h
Remaining Estimate: 0h

While debugging Hadoop daemons for prototyping, I discovered that Eclipse and Netbeans can be used for the purpose. To do this, import the source code of Hadoop as a Java Project (you will have to do some refactoring to make sure the imports and packages are correct) instead of as an ant project or an existing project. Please make sure the apache-commons libraries are in the classpath, and you should now be able to launch a particular daemon from its package inside the debugger. I find it particularly useful to use the Eclipse and Netbeans standard debuggers for any code increments, as it's fairly simple to follow a stack trace and point to the exact code error.
[jira] [Commented] (HADOOP-8004) Multiple SLF4J binding message in .out file for all daemons
[ https://issues.apache.org/jira/browse/HADOOP-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196111#comment-13196111 ]

Harsh J commented on HADOOP-8004:

Hi,

Joe has proposed to post a patch for this at HDFS-2599 (now HADOOP-8005, because it apparently targets all subprojects), where it was initially reported. Please post your reviews there. Closing this as a dupe.

Multiple SLF4J binding message in .out file for all daemons

Key: HADOOP-8004
URL: https://issues.apache.org/jira/browse/HADOOP-8004
Project: Hadoop Common
Issue Type: Bug
Components: build
Affects Versions: 0.24.0, 0.23.1
Reporter: Devaraj K

{code:xml}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/install/hadoop-0.23.1-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/install/hadoop-0.23.1-SNAPSHOT/share/hadoop/hdfs/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/install/hadoop-0.23.1-SNAPSHOT/share/hadoop/mapreduce/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
{code}
[jira] [Commented] (HADOOP-7949) Updated maxIdleTime default in the code to match core-default.xml
[ https://issues.apache.org/jira/browse/HADOOP-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196691#comment-13196691 ]

Harsh J commented on HADOOP-7949:

Eli,

Is this pending a 0.23 backport? I do not see a Fix Version(s) set.

Updated maxIdleTime default in the code to match core-default.xml

Key: HADOOP-7949
URL: https://issues.apache.org/jira/browse/HADOOP-7949
Project: Hadoop Common
Issue Type: Bug
Components: ipc
Affects Versions: 0.21.0, 0.22.0, 0.23.0, 1.0.0
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Trivial
Attachments: hadoop-7949.txt, hadoop-7949.txt

HADOOP-2909 intended to set the server max idle time for a connection to twice the client value ("The server-side max idle time should be greater than the client-side max idle time, for example, twice the client-side max idle time."). This way, when a server times out a connection, it's due to a crashed client and not an inactive client, so we don't close client connections with outstanding requests (by setting 2x the client value on the server side, the client should time out the connection first). Looks like there was a typo in the patch and it set the default value to 1/5th the client value, instead of the intended 2x.

{noformat}
hadoop2 (pre-HADOOP-4687)$ git reset --hard 6fa4597e
hadoop2 (pre-HADOOP-4687)$ grep -r ipc.client.connection.maxidletime .
./src/core/org/apache/hadoop/ipc/Client.java: conf.getInt("ipc.client.connection.maxidletime", 10000); //10s
./src/core/org/apache/hadoop/ipc/Server.java: this.maxIdleTime = 2*conf.getInt("ipc.client.connection.maxidletime", 1000);
{noformat}
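Per the description, the intended server-side line would use twice the client default rather than a fifth of it; a sketch of the fix (not the committed patch):

{code}
// Server-side: 2x the client-side default of 10000 ms, so an idle
// connection is timed out (and cleanly closed) by the client first.
this.maxIdleTime = 2 * conf.getInt("ipc.client.connection.maxidletime", 10000);
{code}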
[jira] [Commented] (HADOOP-7992) Add ZK client for leader election
[ https://issues.apache.org/jira/browse/HADOOP-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193150#comment-13193150 ]

Harsh J commented on HADOOP-7992:

Just wondering if Curator would've been a better choice here, instead of writing our own?

Add ZK client for leader election

Key: HADOOP-7992
URL: https://issues.apache.org/jira/browse/HADOOP-7992
Project: Hadoop Common
Issue Type: Sub-task
Components: ha
Reporter: Suresh Srinivas
Assignee: Bikas Saha
Fix For: HA Branch (HDFS-1623)
Attachments: HDFS-2681.HDFS-1623.patch, HDFS-2681.HDFS-1623.patch, HDFS-2681.HDFS-1623.patch, HDFS-2681.HDFS-1623.patch, HDFS-2681.HDFS-1623.patch, HDFS-2681.HDFS-1623.patch, HDFS-2681.HDFS-1623.patch, HDFS-2681.HDFS-1623.patch, HDFS-2681.HDFS-1623.patch, HDFS-2681.HDFS-1623.patch, HDFS-2681.txt, HDFS-2681.txt, Zookeeper based Leader Election and Monitoring Library.pdf

ZKClient needs to support the following capabilities:
# Ability to create a znode for co-ordinating leader election.
# Ability to monitor and receive callbacks when active znode status changes.
# Ability to get information about the active node.
[jira] [Commented] (HADOOP-7795) add -n option for FSshell cat
[ https://issues.apache.org/jira/browse/HADOOP-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190617#comment-13190617 ]

Harsh J commented on HADOOP-7795:

Xie,

Have you ever needed this? How often have you come across data files on HDFS that you needed line-number references for? Given that other tools out there already do this, and -cat output can interface with them (maybe a little inefficiently, but I don't think it'd even matter), I'd not want to rebuild the whole of coreutils' cat feature set inside Hadoop. I'm inclined to close this as Won't Fix unless I hear strong reasons why we ought to be building this in.

add -n option for FSshell cat

Key: HADOOP-7795
URL: https://issues.apache.org/jira/browse/HADOOP-7795
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 0.24.0
Reporter: XieXianshan
Assignee: XieXianshan
Fix For: 0.24.0
Attachments: HADOOP-7795.patch

Add a -n option for cat to display files with line numbers. It's quite useful if you're reading big files.
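The existing-tools route alluded to above is a simple pipe (the path is illustrative):

{code}
hadoop fs -cat /path/to/file | cat -n
{code}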
[jira] [Commented] (HADOOP-7510) Tokens should use original hostname provided instead of ip
[ https://issues.apache.org/jira/browse/HADOOP-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190618#comment-13190618 ]

Harsh J commented on HADOOP-7510:

Please fix this bit in HftpFileSystem as well, if still relevant:

{code}
//TODO: un-comment the following once HDFS-7510 is committed.
//return getConf().getInt(DFSConfigKeys.DFS_NAMENODE_HTTP_PORT_KEY,
//DFSConfigKeys.DFS_NAMENODE_HTTP_PORT_DEFAULT);
}
{code}

Tokens should use original hostname provided instead of ip

Key: HADOOP-7510
URL: https://issues.apache.org/jira/browse/HADOOP-7510
Project: Hadoop Common
Issue Type: Improvement
Components: security
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Fix For: 0.20.205.0
Attachments: HADOOP-7510-10.patch, HADOOP-7510-11.patch, HADOOP-7510-12.patch, HADOOP-7510-2.patch, HADOOP-7510-3.patch, HADOOP-7510-4.patch, HADOOP-7510-5.patch, HADOOP-7510-6.patch, HADOOP-7510-8.patch, HADOOP-7510-9.patch, HADOOP-7510.patch

Tokens currently store the ip:port of the remote server. This precludes tokens from being used after a host's ip is changed. Tokens should store the hostname used to make the RPC connection. This will enable new processes to use their existing tokens.
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188908#comment-13188908 ]

Harsh J commented on HADOOP-1381:

Todd/others, are there any other comments you'd like me to address?

The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

Key: HADOOP-1381
URL: https://issues.apache.org/jira/browse/HADOOP-1381
Project: Hadoop Common
Issue Type: Improvement
Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
Fix For: 0.24.0
Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff

Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much better if it was configurable with a much higher default (1 MB or so?).
[jira] [Commented] (HADOOP-6801) io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java
[ https://issues.apache.org/jira/browse/HADOOP-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188915#comment-13188915 ]

Harsh J commented on HADOOP-6801:

Hey folks, can someone pitch in and give this mostly-docfix a quick review?

io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

Key: HADOOP-6801
URL: https://issues.apache.org/jira/browse/HADOOP-6801
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Erik Steffl
Assignee: Harsh J
Priority: Minor
Attachments: HADOOP-6801.r1.diff, HADOOP-6801.r2.diff

The following configuration keys in CommonConfigurationKeysPublic.java (formerly CommonConfigurationKeys.java):

public static final String IO_SORT_MB_KEY = "io.sort.mb";
public static final String IO_SORT_FACTOR_KEY = "io.sort.factor";

are partially moved:
- they were renamed to mapreduce.task.io.sort.mb and mapreduce.task.io.sort.factor respectively
- they were moved to the mapreduce project and documented in mapred-default.xml

However:
- they are still listed in CommonConfigurationKeysPublic.java as quoted above
- the strings "io.sort.mb" and "io.sort.factor" are used in SequenceFile.java in the Hadoop Common project

Not sure what the solution is; these constants should probably be removed from CommonConfigurationKeysPublic.java, but I am not sure what's the best solution for SequenceFile.java.
[jira] [Commented] (HADOOP-7968) Errant println left in RPC.getHighestSupportedProtocol
[ https://issues.apache.org/jira/browse/HADOOP-7968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186211#comment-13186211 ]

Harsh J commented on HADOOP-7968:

Sho,

The change is fine, but please wrap it in an {{if (LOG.isDebugEnabled()) { ... }}} condition. Thanks!

Errant println left in RPC.getHighestSupportedProtocol

Key: HADOOP-7968
URL: https://issues.apache.org/jira/browse/HADOOP-7968
Project: Hadoop Common
Issue Type: Bug
Components: ipc
Affects Versions: 0.24.0
Reporter: Todd Lipcon
Assignee: Sho Shimauchi
Priority: Minor
Labels: newbie
Attachments: HADOOP-7968.txt

hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java:

System.out.println("Size of protoMap for " + rpcKind + " = " + getProtocolImplMap(rpcKind).size());
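Concretely, the guarded replacement would look like the following (a sketch; LOG is assumed to be the class's existing commons-logging logger):

{code}
if (LOG.isDebugEnabled()) {
  LOG.debug("Size of protoMap for " + rpcKind + " = "
      + getProtocolImplMap(rpcKind).size());
}
{code}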
[jira] [Commented] (HADOOP-7974) TestViewFsTrash incorrectly determines the user's home directory
[ https://issues.apache.org/jira/browse/HADOOP-7974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186410#comment-13186410 ]

Harsh J commented on HADOOP-7974:

(Also fixed some diff-surrounding whitespace and indentation issues.)

TestViewFsTrash incorrectly determines the user's home directory

Key: HADOOP-7974
URL: https://issues.apache.org/jira/browse/HADOOP-7974
Project: Hadoop Common
Issue Type: Bug
Components: fs
Affects Versions: 0.23.1
Reporter: Eli Collins
Assignee: Harsh J
Attachments: HADOOP-7974.patch

HADOOP-7284 added a test called TestViewFsTrash which contains the following code to determine the user's home directory. It only works if the user's directory is one level deep, and breaks if the home directory is more than one level deep (e.g. user hudson, whose home dir might be /usr/lib/hudson instead of /home/hudson).

{code}
// create a link for home directory so that trash path works
// set up viewfs's home dir root to point to home dir root on target
// But home dir is different on linux, mac etc.
// Figure it out by calling home dir on target
String homeDir = fsTarget.getHomeDirectory().toUri().getPath();
int indexOf2ndSlash = homeDir.indexOf('/', 1);
String homeDirRoot = homeDir.substring(0, indexOf2ndSlash);
ConfigUtil.addLink(conf, homeDirRoot,
    fsTarget.makeQualified(new Path(homeDirRoot)).toUri());
ConfigUtil.setHomeDirConf(conf, homeDirRoot);
Log.info("Home dir base " + homeDirRoot);
{code}

Seems like we should instead search from the end of the path for the last slash and use that as the base, i.e. ask the home directory for its parent.
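The suggested fix, sketched using Path.getParent() to ask the home directory for its parent (illustrative, not the attached patch):

{code}
// Use the parent of the home directory as the link root, so home
// directories nested more than one level deep (e.g. /usr/lib/hudson)
// also work.
String homeDirRoot = fsTarget.getHomeDirectory().getParent()
    .toUri().getPath();
{code}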
[jira] [Commented] (HADOOP-7973) DistributedFileSystem close has severe consequences
[ https://issues.apache.org/jira/browse/HADOOP-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186424#comment-13186424 ] Harsh J commented on HADOOP-7973: - (Coming in late, but just to add my thoughts…) I've seen this bite users as well, but it's more because they do not understand how to use the FS objects than anything else: bq. A MR task that uses FsShell. The shell opens a DFS, performs its action, and the shell will close the DFS. Now the MR input stream close to that same filesystem will fail. Is FsShell a publicly supported API now? bq. User map task code that opens the default filesystem and subsequently closes it. MR input stream close will fail. Users should not be closing the FS handle. Users shall open/close the streams they use. Is that not the right thing to do? DistributedFileSystem close has severe consequences --- Key: HADOOP-7973 URL: https://issues.apache.org/jira/browse/HADOOP-7973 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 1.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HADOOP-7973.patch The way {{FileSystem#close}} works is very problematic. Since the {{FileSystems}} are cached, any {{close}} by any caller will cause problems for every other reference to it. Will add more detail in the comments. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
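For readers unfamiliar with the caching behavior under discussion, a hedged illustration (assuming fs.defaultFS points at HDFS; the exact error text may vary by version):
{code}
// Hedged sketch: FileSystem.get() returns a cached, shared instance,
// so one caller's close() affects every other holder of that reference.
FileSystem fsA = FileSystem.get(conf);
FileSystem fsB = FileSystem.get(conf);  // same object as fsA (cache hit)
fsA.close();
fsB.open(new Path("/some/file"));       // on HDFS: IOException "Filesystem closed"
{code}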
[jira] [Commented] (HADOOP-7959) Contradiction in Hadoop Documentation
[ https://issues.apache.org/jira/browse/HADOOP-7959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186433#comment-13186433 ] Harsh J commented on HADOOP-7959: - Great catch Bryan. Would you mind contributing a doc-fix patch that fixes the Progressable statement? Thanks, Harsh Contradiction in Hadoop Documentation - Key: HADOOP-7959 URL: https://issues.apache.org/jira/browse/HADOOP-7959 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 1.0.0 Environment: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/Progressable.html Reporter: Bryan Halfpap Priority: Minor Labels: Documentation Original Estimate: 1h Remaining Estimate: 1h http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/Progressable.html The statement: "This is especially important for operations which take an insignificant amount of time since, in-lieu of the reported progress, the framework has to assume that an error has occurred and time-out the operation." in the aforementioned URL directly contradicts http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Reporter which states: "where the application takes a significant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed-out and kill that task." The two statements should be reconciled in order to remove confusion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7956) Remove duplicate definition of default config values
[ https://issues.apache.org/jira/browse/HADOOP-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186434#comment-13186434 ] Harsh J commented on HADOOP-7956: - I like #2. #1 has a chance of breaking some behavior, given our _reconfiguration_ history. One I can think of, which came up recently from Koji on the ML, is the MR child opts vs. map/reduce-specific child opts -- if both are present in default.xml, the specific ones void the former, breaking behavior even if the user wants to specify it. Remove duplicate definition of default config values Key: HADOOP-7956 URL: https://issues.apache.org/jira/browse/HADOOP-7956 Project: Hadoop Common Issue Type: Improvement Components: conf Reporter: Eli Collins We define default configuration values in two places: #1 The default.xml files (eg core-default.xml) #2 The _DEFAULT defines in *Keys.java This means the defaults used in the code may or may not be dead code based on whether the config is present in the default xml file. Would be good to define these in one place. Eg: #1 Just have the defines in the code and figure out how to make those accessible as a loadable resource (eg could generate the default files from the defines in the KeysPublic* files) #2 Remove one of the definitions entirely (possible to live w/o the default files?) or #3 Remove the overlap between the code and default files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
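To make the duplication concrete, a hedged example of one default living in two places (constants as they appear in trunk's CommonConfigurationKeysPublic; the XML entry paraphrased from core-default.xml):
{code}
// In CommonConfigurationKeysPublic.java:
public static final String IO_FILE_BUFFER_SIZE_KEY = "io.file.buffer.size";
public static final int    IO_FILE_BUFFER_SIZE_DEFAULT = 4096;

// ...and the same value again in core-default.xml:
//   <property>
//     <name>io.file.buffer.size</name>
//     <value>4096</value>
//   </property>
// Code like conf.getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT)
// only falls back to the Java default when the XML entry is absent.
{code}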
[jira] [Commented] (HADOOP-7943) DFS shell get/copy gives weird errors when permissions are wrong with directories
[ https://issues.apache.org/jira/browse/HADOOP-7943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186437#comment-13186437 ] Harsh J commented on HADOOP-7943: - Ben, would you be interested in finding and filing a fix patch for this? For 1.0 (formerly 0.20.205), the relevant class/filename would be FsShell.java (nested under src/core/). DFS shell get/copy gives weird errors when permissions are wrong with directories - Key: HADOOP-7943 URL: https://issues.apache.org/jira/browse/HADOOP-7943 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.20.205.0 Reporter: Ben West Priority: Minor Labels: hdfs, shell Let /foo be a *directory* in HDFS (issue does not occur with files) and /bar be a local dir. Do something like:
{code}
$ chmod u-w /bar
$ hadoop fs -get /foo/myfile /bar
copyToLocal: Permission denied  # correctly tells me permission is denied
$ hadoop fs -get /foo /bar
copyToLocal: null
$ hadoop fs -get /foo/ /bar
copyToLocal: No such file or directory
{code}
I've been banging my head for a bit trying to figure out why hadoop thinks my directory doesn't exist, but it turns out the problem was just with my local permissions. The "Permission denied" error would've been a lot nicer to get. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7942) enabling clover coverage reports fails hadoop unit test compilation
[ https://issues.apache.org/jira/browse/HADOOP-7942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186438#comment-13186438 ] Harsh J commented on HADOOP-7942: - If this is done, please mark it as resolved, or update the target and fix versions appropriately. enabling clover coverage reports fails hadoop unit test compilation --- Key: HADOOP-7942 URL: https://issues.apache.org/jira/browse/HADOOP-7942 Project: Hadoop Common Issue Type: Test Affects Versions: 1.1.0 Environment: https://builds.apache.org/view/G-L/view/Hadoop/job/Hadoop-1-Code-Coverage/13/console Reporter: Giridharan Kesavan Assignee: Jitendra Nath Pandey Attachments: HADOOP-7942.branch-1.patch enabling clover reports fails compiling the following junit tests. link to the console output of jenkins: https://builds.apache.org/view/G-L/view/Hadoop/job/Hadoop-1-Code-Coverage/13/console
{noformat}
[javac] /tmp/clover50695626838999169.tmp/org/apache/hadoop/security/TestUserGroupInformation.java:224: cannot find symbol
..
[javac] /tmp/clover50695626838999169.tmp/org/apache/hadoop/security/TestUserGroupInformation.java:225: cannot find symbol
..
[javac] /tmp/clover50695626838999169.tmp/org/apache/hadoop/security/TestJobCredentials.java:67: cannot find symbol
[javac] symbol : class T
..
[javac] /tmp/clover50695626838999169.tmp/org/apache/hadoop/security/TestJobCredentials.java:68: cannot find symbol
[javac] symbol : class T
.
[javac] /tmp/clover50695626838999169.tmp/org/apache/hadoop/fs/TestFileSystem.java:653: cannot find symbol
[javac] symbol : class T
.
[javac] ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 5 errors
[javac] 63 warnings
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7940) method clear() in org.apache.hadoop.io.Text does not work
[ https://issues.apache.org/jira/browse/HADOOP-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186440#comment-13186440 ] Harsh J commented on HADOOP-7940: - Hey Aaron, Would you be interested in contributing a fix and a test case for this as well? Thanks, Harsh method clear() in org.apache.hadoop.io.Text does not work - Key: HADOOP-7940 URL: https://issues.apache.org/jira/browse/HADOOP-7940 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.205.0 Environment: Ubuntu, hadoop cloudera CDH3U2, Oracle SUN JDK 6U30 Reporter: Aaron, Original Estimate: 2h Remaining Estimate: 2h
{code}
LineReader reader = new LineReader(in, 4096);
...
Text text = new Text();
while ((reader.readLine(text)) > 0) {
  ...
  text.clear();
}
{code}
Even though the clear() method is called each time, some bytes are still not zeroed. So, when reader.readLine(text) is called in a loop, some bytes are dirty, left over from the last call. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
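The reported behavior is actually by design: clear() only resets the logical length, not the backing array. A hedged sketch of the safe read pattern (same {{reader}} variable as the snippet above):
{code}
// Hedged sketch: only the first getLength() bytes of getBytes() are valid;
// the backing array may keep tail bytes from a previous, longer line.
Text text = new Text();
while (reader.readLine(text) > 0) {
  String line = text.toString();   // decodes only the valid getLength() bytes
  byte[] raw = text.getBytes();    // valid range is [0, text.getLength())
  // ... process line / raw ...
  text.clear();                    // resets length to 0; does not zero bytes
}
{code}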
[jira] [Commented] (HADOOP-7297) Error in the documentation regarding Checkpoint/Backup Node
[ https://issues.apache.org/jira/browse/HADOOP-7297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181537#comment-13181537 ] Harsh J commented on HADOOP-7297: - Tony, Yes, will be fixed in 1.1.0's doc publishing. Sorry about this. Error in the documentation regarding Checkpoint/Backup Node --- Key: HADOOP-7297 URL: https://issues.apache.org/jira/browse/HADOOP-7297 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 0.20.205.0 Reporter: arnaud p Assignee: Harsh J Priority: Trivial Fix For: 0.20.203.1, 1.1.0 Attachments: hadoop-7297.patch, hadoop-7297.patch On http://hadoop.apache.org/common/docs/r0.20.203.0/hdfs_user_guide.html#Checkpoint+Node: the command bin/hdfs namenode -checkpoint required to launch the backup/checkpoint node does not exist. I have removed this from the docs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7936) There's a Hoop README in the root dir of the tarball
[ https://issues.apache.org/jira/browse/HADOOP-7936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181115#comment-13181115 ] Harsh J commented on HADOOP-7936: - +1, that should fix it. There's a Hoop README in the root dir of the tarball Key: HADOOP-7936 URL: https://issues.apache.org/jira/browse/HADOOP-7936 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.1 Reporter: Eli Collins Assignee: Alejandro Abdelnur Fix For: 0.23.1 Attachments: HADOOP-7936.patch The Hoop README.txt is now in the root dir of the tarball. {noformat} hadoop-trunk1 $ tar xvzf hadoop-dist/target/hadoop-0.24.0-SNAPSHOT.tar.gz -C /tmp/ .. hadoop-trunk1 $ head -n3 /tmp/hadoop-0.24.0-SNAPSHOT/README.txt - HttpFS - Hadoop HDFS over HTTP {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7466) Hadoop Log improvement Umbrella
[ https://issues.apache.org/jira/browse/HADOOP-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179490#comment-13179490 ] Harsh J commented on HADOOP-7466: - Regarding [~sureshms]'s comments, it may make more sense if we split up log improvements package-wise instead of class-wise. I'm available for splitting up some work to refine log statements (much needed for making ops smile someday). Let me know if you need help. Hadoop Log improvement Umbrella --- Key: HADOOP-7466 URL: https://issues.apache.org/jira/browse/HADOOP-7466 Project: Hadoop Common Issue Type: Improvement Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Trivial Created one umbrella issue, and we can link all the log improvement issues to it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7951) viewfs fails unless all mount points are available
[ https://issues.apache.org/jira/browse/HADOOP-7951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179603#comment-13179603 ] Harsh J commented on HADOOP-7951: - Is this related to HADOOP-7950? viewfs fails unless all mount points are available -- Key: HADOOP-7951 URL: https://issues.apache.org/jira/browse/HADOOP-7951 Project: Hadoop Common Issue Type: Bug Components: fs, security Affects Versions: 0.24.0, 0.23.1 Reporter: Daryn Sharp Priority: Critical Obtaining a delegation token via viewfs will attempt to acquire tokens from all filesystems in the mount table. All clients that obtain tokens, including job submissions, will fail if any of the mount points are unavailable -- even if paths in the unavailable mount will not be accessed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7945) Document that Path objects do not support ":" in them.
[ https://issues.apache.org/jira/browse/HADOOP-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178704#comment-13178704 ] Harsh J commented on HADOOP-7945: - I'm thinking it might just be better to document this and add a test case (that'd fail the day it's fixed). Maybe this approach could cause incompatibility, as I'm not sure if there are ways to 'tweak' the Path into accepting ':' (escaped/encoded). Another character that causes issues is a slash in the filename itself, even if escaped/encoded. Document that Path objects do not support ":" in them. -- Key: HADOOP-7945 URL: https://issues.apache.org/jira/browse/HADOOP-7945 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.20.0 Reporter: Harsh J Priority: Critical Labels: newbie Attachments: HADOOP-7945.patch Until HADOOP-3257 is fixed, this particular exclusion should be documented. This is a major upsetter to many beginners. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
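A hedged sketch of the kind of test being proposed (the date-like name is the classic trigger: Path's scheme detection misreads a relative name containing a colon):
{code}
// Hedged sketch: documents today's limitation; intentionally starts
// failing the day HADOOP-3257 fixes colon handling.
@Test(expected = IllegalArgumentException.class)
public void testColonInRelativePathIsRejected() {
  new Path("2008-04-30_10:50:59");
  // -> IllegalArgumentException: java.net.URISyntaxException:
  //    Relative path in absolute URI: 2008-04-30_10:50:59
}
{code}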
[jira] [Commented] (HADOOP-7949) Server maxIdleTime is way too low
[ https://issues.apache.org/jira/browse/HADOOP-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179276#comment-13179276 ] Harsh J commented on HADOOP-7949: - +1, the changes look good to me. Thanks for 'constantizing' this! Server maxIdleTime is way too low - Key: HADOOP-7949 URL: https://issues.apache.org/jira/browse/HADOOP-7949 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 0.21.0, 0.22.0, 0.23.0, 1.0.0 Reporter: Eli Collins Assignee: Eli Collins Attachments: hadoop-7949.txt HADOOP-2909 intended to set the server max idle time for a connection to twice the client value. ("The server-side max idle time should be greater than the client-side max idle time, for example, twice of the client-side max idle time.") This way, when a server times out a connection, it's due to a crashed client and not an inactive client, so we don't close client connections with outstanding requests (by setting 2x the client value on the server side, the client should time out the connection first). Looks like there was a typo in the patch and it set the default value to 1/5th the client value, instead of the intended 2x.
{noformat}
hadoop2 (pre-HADOOP-4687)$ git reset --hard 6fa4597e
hadoop2 (pre-HADOOP-4687)$ grep -r ipc.client.connection.maxidletime .
./src/core/org/apache/hadoop/ipc/Client.java: conf.getInt("ipc.client.connection.maxidletime", 10000); //10s
./src/core/org/apache/hadoop/ipc/Server.java:this.maxIdleTime = 2*conf.getInt("ipc.client.connection.maxidletime", 1000);
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
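A hedged sketch of the 'constantized' shape being +1'd here (key/default names follow the CommonConfigurationKeysPublic convention; treat them as illustrative rather than the exact committed diff):
{code}
// Hedged sketch: one shared constant so Client and Server agree, with the
// server using twice the client idle time as HADOOP-2909 intended.
public static final String IPC_CLIENT_CONNECTION_MAXIDLETIME_KEY =
    "ipc.client.connection.maxidletime";
public static final int IPC_CLIENT_CONNECTION_MAXIDLETIME_DEFAULT = 10000; // 10s

// Client.java
this.maxIdleTime = conf.getInt(IPC_CLIENT_CONNECTION_MAXIDLETIME_KEY,
    IPC_CLIENT_CONNECTION_MAXIDLETIME_DEFAULT);
// Server.java
this.maxIdleTime = 2 * conf.getInt(IPC_CLIENT_CONNECTION_MAXIDLETIME_KEY,
    IPC_CLIENT_CONNECTION_MAXIDLETIME_DEFAULT);
{code}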
[jira] [Commented] (HADOOP-7945) Document that Path objects do not support ":" in them.
[ https://issues.apache.org/jira/browse/HADOOP-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178186#comment-13178186 ] Harsh J commented on HADOOP-7945: - We can document this in Path's javadocs; I think that should suffice. But brownie points if you can think of somewhere more visible and relevant :) Document that Path objects do not support ":" in them. -- Key: HADOOP-7945 URL: https://issues.apache.org/jira/browse/HADOOP-7945 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.20.0 Reporter: Harsh J Priority: Critical Labels: newbie Until HADOOP-3257 is fixed, this particular exclusion should be documented. This is a major upsetter to many beginners. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7418) support for multiple slashes in the path separator
[ https://issues.apache.org/jira/browse/HADOOP-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178193#comment-13178193 ] Harsh J commented on HADOOP-7418: - Andrew, did you get a chance to take a look at why TestFcHdfsSymlink may have failed? Would you still be interested in pursuing this patch? :) support for multiple slashes in the path separator -- Key: HADOOP-7418 URL: https://issues.apache.org/jira/browse/HADOOP-7418 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.23.0 Environment: Linux running JDK 1.6 Reporter: Sudharsan Sampath Assignee: Andrew Look Priority: Minor Labels: newbie Fix For: 0.24.0 Attachments: HADOOP-7418--20110719.txt, HADOOP-7418.txt, HADOOP-7418.txt, HDFS-1460.txt, HDFS-1460.txt The parsing of the input path string to identify the URI authority conflicts with file system paths. For instance, the following is a valid path in both the Linux file system and HDFS: //user/directory1//directory2. While this works perfectly fine on the command line for manipulating HDFS, the same fails when specified as the input path for a mapper class, with the following exception: Exception in thread "main" java.net.UnknownHostException: unknown host: user at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195) as the org.apache.hadoop.fs.Path class assumes the string that follows the '//' to be a URI authority -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
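A hedged illustration of the parsing conflict described (values shown are what current Path parsing produces):
{code}
// Hedged sketch: a leading "//" makes Path treat the first path
// component as a URI authority rather than a directory name.
Path p = new Path("//user/directory1//directory2");
p.toUri().getAuthority();  // "user" -- later resolved as a hostname,
                           // hence UnknownHostException: unknown host: user
p.toUri().getPath();       // "/directory1/directory2"
{code}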
[jira] [Commented] (HADOOP-4603) Installation on Solaris needs additional PATH setting
[ https://issues.apache.org/jira/browse/HADOOP-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178207#comment-13178207 ] Harsh J commented on HADOOP-4603: - Hey Allen, if you can +1, we can commit changes similar to Kohsuke's into the trunk to at least have whoami working on Solaris (it's a start…). Let me know if there really is no better way. Installation on Solaris needs additional PATH setting - Key: HADOOP-4603 URL: https://issues.apache.org/jira/browse/HADOOP-4603 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.18.2 Environment: Solaris 10 x86 Reporter: Jon Brisbin Attachments: HADOOP-4603, id_instead_of_whoami.diff A default installation as outlined in the docs won't start on Solaris 10 x86. The whoami utility is in path /usr/ucb on Solaris 10, which isn't in the standard PATH environment variable unless the user has added that specifically. The documentation should reflect this. Solaris 10 also seemed to throw NPEs if you didn't explicitly set the IP address to bind the servers to. Simply overriding the IP address fixes the problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7945) Document that Path objects do not support ":" in them.
[ https://issues.apache.org/jira/browse/HADOOP-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178208#comment-13178208 ] Harsh J commented on HADOOP-7945: - I guess that could bring in compatibility issues. (People begin expecting that to fail, and at some point in the future a bug fix like HADOOP-3257 regresses against that thought.) But otherwise, that idea is sound. Do we emit any "Invalid char in path" form of messages today? If we do, then this is not a worry; we could inject a check for the colon there somehow. Want to take a shot at it, Niels? :) Document that Path objects do not support ":" in them. -- Key: HADOOP-7945 URL: https://issues.apache.org/jira/browse/HADOOP-7945 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.20.0 Reporter: Harsh J Priority: Critical Labels: newbie Until HADOOP-3257 is fixed, this particular exclusion should be documented. This is a major upsetter to many beginners. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7945) Document that Path objects do not support ":" in them.
[ https://issues.apache.org/jira/browse/HADOOP-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178239#comment-13178239 ] Harsh J commented on HADOOP-7945: - Nope, given checkPathArg's positioning, that wouldn't work for paths that are fully qualified (i.e. ones that need a ':' as part of the {{file:///path/to/some/file/:/with/colon}}). Document that Path objects do not support ":" in them. -- Key: HADOOP-7945 URL: https://issues.apache.org/jira/browse/HADOOP-7945 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.20.0 Reporter: Harsh J Priority: Critical Labels: newbie Until HADOOP-3257 is fixed, this particular exclusion should be documented. This is a major upsetter to many beginners. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7713) dfs -count -q should label output column
[ https://issues.apache.org/jira/browse/HADOOP-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178297#comment-13178297 ] Harsh J commented on HADOOP-7713: - +1 for the change. Would you mind updating docs for this as well? dfs -count -q should label output column Key: HADOOP-7713 URL: https://issues.apache.org/jira/browse/HADOOP-7713 Project: Hadoop Common Issue Type: Improvement Reporter: Nigel Daley Assignee: Jonathan Allen Priority: Trivial Labels: newbie Attachments: HDFS-364.patch, HDFS-364.patch These commands should label the output columns:
{code}
hadoop dfs -count <dir>...<dir>
hadoop dfs -count -q <dir>...<dir>
{code}
Current output of the 2nd command above:
{code}
% hadoop dfs -count -q /user/foo /tmp
none inf 9569 9493 6372553322 hdfs://nn1.bar.com/user/foo
none inf 101 2689 209349812906 hdfs://nn1.bar.com/tmp
{code}
It is not obvious what these columns mean. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7478) 0-byte files retained in the dfs while the FSshell -put is unsuccessful
[ https://issues.apache.org/jira/browse/HADOOP-7478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178298#comment-13178298 ] Harsh J commented on HADOOP-7478: - Exec'ing {{fs -put}} should return a proper exit code in trunk, I think, given Daryn's work on the whole revamp. I'd argue that this behavior is OK, since it indicates that the file's inode was successfully created, but none of the bytes were successfully written. 0-byte files retained in the dfs while the FSshell -put is unsuccessful --- Key: HADOOP-7478 URL: https://issues.apache.org/jira/browse/HADOOP-7478 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: XieXianshan Priority: Trivial Attachments: HADOOP-7478.patch The process of putting a file into dfs is approximately as follows: 1) create a file in the dfs 2) copy from one stream to the file But the problem is that the file is still retained in the dfs when process 2) is terminated abnormally with unexpected exceptions, such as when there is no DataNode alive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
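For readers following along, a hedged sketch of the two-step copy described above ({{in}} and {{dst}} assumed in scope; error handling simplified; IOUtils.copyBytes is the usual helper for step 2):
{code}
// Hedged sketch of -put: the destination inode exists after step 1,
// so a failure in step 2 leaves an empty file behind.
FSDataOutputStream out = fs.create(dst);     // step 1: inode created here
try {
  IOUtils.copyBytes(in, out, conf, false);   // step 2: may throw, e.g.
} finally {                                  // when no DataNode is alive
  out.close();
}
{code}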
[jira] [Commented] (HADOOP-7574) Improvement for FSshell -stat
[ https://issues.apache.org/jira/browse/HADOOP-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178107#comment-13178107 ] Harsh J commented on HADOOP-7574: - Ah looks like I missed HDFS-2319. Getting to that now. Improvement for FSshell -stat - Key: HADOOP-7574 URL: https://issues.apache.org/jira/browse/HADOOP-7574 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: XieXianshan Priority: Trivial Fix For: 0.24.0 Attachments: HADOOP-7574-v0.2.patch, HADOOP-7574-v0.3.patch, HADOOP-7574.patch Add two optional formats for FSshell -stat, one is %G for group name of owner and the other is %U for user name. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7574) Improvement for FSshell -stat
[ https://issues.apache.org/jira/browse/HADOOP-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178109#comment-13178109 ] Harsh J commented on HADOOP-7574: - Xie, in the future, consider it OK to attach a single relevant patch now that the code base is merged :) Improvement for FSshell -stat - Key: HADOOP-7574 URL: https://issues.apache.org/jira/browse/HADOOP-7574 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: XieXianshan Priority: Trivial Fix For: 0.24.0 Attachments: HADOOP-7574-v0.2.patch, HADOOP-7574-v0.3.patch, HADOOP-7574.patch Add two optional formats for FSshell -stat, one is %G for group name of owner and the other is %U for user name. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7944) Would like to update development getting started pages on hadoop wiki but don't have permission
[ https://issues.apache.org/jira/browse/HADOOP-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177686#comment-13177686 ] Harsh J commented on HADOOP-7944: - Mark, What's your exact username on the wiki? I can add you into the writable group. Would like to update development getting started pages on hadoop wiki but don't have permission --- Key: HADOOP-7944 URL: https://issues.apache.org/jira/browse/HADOOP-7944 Project: Hadoop Common Issue Type: Task Components: documentation Affects Versions: 0.23.0 Environment: Ubuntu 10.04.2 LTS Reporter: Mark Pollack Priority: Minor I've created an account on the wiki but can't edit pages. The wiki page http://wiki.apache.org/hadoop/EclipseEnvironment has some out-of-date information, for example mvn test -DskipTests mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true cd ../; cd mapreduce; ant compile eclipse should be mvn install -DskipTests mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true 'install' is needed instead of test in order to get the build artifacts in the local M2 repo for reference in the eclipse build path. Also there isn't a 'mapreduce' directory anymore and the mvn eclipse:eclipse command creates the necessary .project files under the 0.23 'hadoop-mapreduce-project' directory. I'd also like to add a blurb about using eclipse with m2e/m2eclipse. For a maven-based project many devs would just import the root pom.xmls. With the new release of the m2e plug-in, this doesn't work anymore as pretty much all targets are not supported by the new 'connector framework' - yes, it is a giant mess. This means falling back to m2eclipse or just doing the eclipse generation as mentioned. Adding a pointer to http://wiki.apache.org/hadoop/HowToContribute covering the requirements to install the java, maven, and protoc compilers would be helpful. The information on getting started from scratch seems a bit scattered and I'd like to help clean that up. Please let me know how I can help to contribute in this area. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7944) Would like to update development getting started pages on hadoop wiki but don't have permission
[ https://issues.apache.org/jira/browse/HADOOP-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177691#comment-13177691 ] Harsh J commented on HADOOP-7944: - Thanks for your username. I think I have an issue with my account presently where even I'm unable to edit the writable group. I'll need to get that resolved first (I'm sure I had access before…). @Other committers - If you're able to add MarkPollack in, please do :) Meanwhile, I'll try and add in your changes now to that page. Thanks! I'll resolve this when you're added and on your way :) Would like to update development getting started pages on hadoop wiki but don't have permission --- Key: HADOOP-7944 URL: https://issues.apache.org/jira/browse/HADOOP-7944 Project: Hadoop Common Issue Type: Task Components: documentation Affects Versions: 0.23.0 Environment: Ubuntu 10.04.2 LTS Reporter: Mark Pollack Priority: Minor I've created an account on the wiki but can't edit pages. The wiki page http://wiki.apache.org/hadoop/EclipseEnvironment has some out-of-date information, for example mvn test -DskipTests mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true cd ../; cd mapreduce; ant compile eclipse should be mvn install -DskipTests mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true 'install' is needed instead of test in order to get the build artifacts in the local M2 repo for reference in the eclipse build path. Also there isn't a 'mapreduce' directory anymore and the mvn eclipse:eclipse command creates the necessary .project files under the 0.23 'hadoop-mapreduce-project' directory. I'd also like to add a blurb about using eclipse with m2e/m2eclipse. For a maven-based project many devs would just import the root pom.xmls. With the new release of the m2e plug-in, this doesn't work anymore as pretty much all targets are not supported by the new 'connector framework' - yes, it is a giant mess. This means falling back to m2eclipse or just doing the eclipse generation as mentioned. Adding a pointer to http://wiki.apache.org/hadoop/HowToContribute covering the requirements to install the java, maven, and protoc compilers would be helpful. The information on getting started from scratch seems a bit scattered and I'd like to help clean that up. Please let me know how I can help to contribute in this area. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7944) Would like to update development getting started pages on hadoop wiki but don't have permission
[ https://issues.apache.org/jira/browse/HADOOP-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177705#comment-13177705 ] Harsh J commented on HADOOP-7944: - I made some changes to both docs; let me know if they are satisfactory (though there's a lot more to improve, yes). I'll need to find some more time to rewrite these in better ways and also kick off a wiki effort again in general soon. Would like to update development getting started pages on hadoop wiki but don't have permission --- Key: HADOOP-7944 URL: https://issues.apache.org/jira/browse/HADOOP-7944 Project: Hadoop Common Issue Type: Task Components: documentation Affects Versions: 0.23.0 Environment: Ubuntu 10.04.2 LTS Reporter: Mark Pollack Priority: Minor I've created an account on the wiki but can't edit pages. The wiki page http://wiki.apache.org/hadoop/EclipseEnvironment has some out-of-date information, for example mvn test -DskipTests mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true cd ../; cd mapreduce; ant compile eclipse should be mvn install -DskipTests mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true 'install' is needed instead of test in order to get the build artifacts in the local M2 repo for reference in the eclipse build path. Also there isn't a 'mapreduce' directory anymore and the mvn eclipse:eclipse command creates the necessary .project files under the 0.23 'hadoop-mapreduce-project' directory. I'd also like to add a blurb about using eclipse with m2e/m2eclipse. For a maven-based project many devs would just import the root pom.xmls. With the new release of the m2e plug-in, this doesn't work anymore as pretty much all targets are not supported by the new 'connector framework' - yes, it is a giant mess. This means falling back to m2eclipse or just doing the eclipse generation as mentioned. Adding a pointer to http://wiki.apache.org/hadoop/HowToContribute covering the requirements to install the java, maven, and protoc compilers would be helpful. The information on getting started from scratch seems a bit scattered and I'd like to help clean that up. Please let me know how I can help to contribute in this area. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7910) add configuration methods to handle human readable size values
[ https://issues.apache.org/jira/browse/HADOOP-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177100#comment-13177100 ] Harsh J commented on HADOOP-7910: - The javadoc warnings appear unrelated. Committing. {code} 6 warnings [WARNING] Javadoc Warnings [WARNING] /Users/harshchouraria/Work/code/apache/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java:1063: warning - Tag @link: can't find call(RpcKind, Writable, InetSocketAddress, [WARNING] Class, UserGroupInformation, int, Configuration) in org.apache.hadoop.ipc.Client [WARNING] /Users/harshchouraria/Work/code/apache/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java:1018: warning - Tag @link: can't find call(RpcKind, Writable, ConnectionId) in org.apache.hadoop.ipc.Client [WARNING] /Users/harshchouraria/Work/code/apache/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java:1048: warning - Tag @link: can't find call(RpcKind, Writable, ConnectionId) in org.apache.hadoop.ipc.Client [WARNING] /Users/harshchouraria/Work/code/apache/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java:1031: warning - Tag @link: can't find call(RpcKind, Writable, ConnectionId) in org.apache.hadoop.ipc.Client [WARNING] /Users/harshchouraria/Work/code/apache/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java:1093: warning - Tag @link: can't find call(RpcKind, Writable, ConnectionId) in org.apache.hadoop.ipc.Client [WARNING] /Users/harshchouraria/Work/code/apache/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java:1007: warning - Tag @link: can't find call(RpcKind, Writable, ConnectionId) in org.apache.hadoop.ipc.Client {code} add configuration methods to handle human readable size values -- Key: HADOOP-7910 URL: https://issues.apache.org/jira/browse/HADOOP-7910 Project: Hadoop Common Issue Type: Improvement Components: conf Reporter: Sho Shimauchi Assignee: Sho Shimauchi Priority: Minor Attachments: HADOOP-7910.patch, HADOOP-7910.patch, HADOOP-7910.patch, HADOOP-7910.patch.3, hadoop-7910.txt It's better to have a new configuration methods which handle human readable size values. For example, see HDFS-1314. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7348) Modify the option of FsShell getmerge from [addnl] to [-nl] to be more comprehensible
[ https://issues.apache.org/jira/browse/HADOOP-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1317#comment-1317 ] Harsh J commented on HADOOP-7348: - Although this is an improvement, it's an incompatible change, so I've marked it as such and updated the release notes. Modify the option of FsShell getmerge from [addnl] to [-nl] to be more comprehensible -- Key: HADOOP-7348 URL: https://issues.apache.org/jira/browse/HADOOP-7348 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: XieXianshan Fix For: 0.24.0 Attachments: HADOOP-7348-v0.3.patch, HADOOP-7348-v0.4.patch, HADOOP-7348.patch, HADOOP-7348.patch, HADOOP-7348.patch, HADOOP-7348.patch_2 The [addnl] option of FsShell getmerge should be either true or false, but it is very hard for users to understand, especially those who've never used this option before. So, the [addnl] option should be changed to [-nl] to be more comprehensible. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-6986) SequenceFile.Reader should distinguish between Network IOE and Parsing IOE
[ https://issues.apache.org/jira/browse/HADOOP-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176828#comment-13176828 ] Harsh J commented on HADOOP-6986: - Sorry to come in very late here, but could you rebase the patch onto the 0.23/trunk branches? I think this is a good change, and +1 to the idea. I trust you've also covered all the spots we can tweak to raise parse exceptions. SequenceFile.Reader should distinguish between Network IOE and Parsing IOE -- Key: HADOOP-6986 URL: https://issues.apache.org/jira/browse/HADOOP-6986 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20-append, 0.21.1, 0.22.0 Reporter: Nicolas Spiegelberg Priority: Minor Attachments: HADOOP-6986_0.21.patch, HADOOP-6986_20-append.patch The SequenceFile.Reader api should give the user an easy way to distinguish between a Network/Low-level IOE and a Parsing IOE. The use case appeared recently in the HBase project: Originally, if a RegionServer got an IOE from HDFS while opening a region file, it would abort the open and let the HMaster reassign the region. The assumption being that this is a network failure that will likely disappear at a later time or different partition of the network. However, if HBase gets parsing exceptions, we want to log the problem and continue opening the region anyway, because parsing is an idempotent problem and retries won't fix this issue. Although this problem was found in HBase, it seems to be a generic problem of being able to more easily identify idempotent vs transient errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-6897) FileSystem#mkdirs(FileSystem, Path, FsPermission) should not call setPermission if mkdirs failed
[ https://issues.apache.org/jira/browse/HADOOP-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176833#comment-13176833 ] Harsh J commented on HADOOP-6897: - I second Nicholas' comment: bq. I think that the static mkdirs(..) is confusing to users. There are only a few lines of code inside. Why don't we just deprecate it now and remove it in the future? This should already be covered by the FileSystem#mkdir/mkdirs methods now. I do not see why we should have a static method for mkdirs and create anymore. FileSystem#mkdirs(FileSystem, Path, FsPermission) should not call setPermission if mkdirs failed - Key: HADOOP-6897 URL: https://issues.apache.org/jira/browse/HADOOP-6897 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.22.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Attachments: mkdirs.patch Here is the piece of code that has the bug. fs.setPermission should not be called if result is false.
{code}
public static boolean mkdirs(FileSystem fs, Path dir, FsPermission permission)
    throws IOException {
  // create the directory using the default permission
  boolean result = fs.mkdirs(dir);
  // set its permission to be the supplied one
  fs.setPermission(dir, permission);
  return result;
}
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
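For completeness, a hedged sketch of the guarded variant, shown only to make the bug's shape clear (as the comment above argues, the better outcome may be deprecating the static helper outright):
{code}
// Hedged sketch: only chmod the directory if mkdirs actually succeeded.
public static boolean mkdirs(FileSystem fs, Path dir, FsPermission permission)
    throws IOException {
  boolean result = fs.mkdirs(dir);
  if (result) {
    fs.setPermission(dir, permission);
  }
  return result;
}
{code}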
[jira] [Commented] (HADOOP-6858) Enable rotateable JVM garbage collection logs for Hadoop daemons
[ https://issues.apache.org/jira/browse/HADOOP-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176839#comment-13176839 ] Harsh J commented on HADOOP-6858: - This would certainly help HBase, given http://hbase.apache.org/book/trouble.log.html I'm not confident that what's being done is the best approach, though it looks reasonable to me, so I'll leave the review to more able people. We'd still like to have this, provided it is also sufficiently documented so that people use this mechanism instead of adding in opts by hand. Enable rotateable JVM garbage collection logs for Hadoop daemons Key: HADOOP-6858 URL: https://issues.apache.org/jira/browse/HADOOP-6858 Project: Hadoop Common Issue Type: New Feature Components: scripts Affects Versions: 0.22.0 Reporter: Andrew Ryan Attachments: HADOOP-6858.patch The purpose of this enhancement is to make it easier to collect garbage collection logs and ensure that they persist across restarts in the same way that the standard output files of Hadoop daemon JVMs currently do. Garbage collection logs are a vital debugging tool for administrators and developers. In our production environments, at some point or another, every single type of Hadoop daemon has OOM'ed or experienced other significant issues related to GC and/or lack of heap memory. For the longest time, we have put in garbage collection logs in our HADOOP_NAMENODE_OPTS, HADOOP_JOBTRACKER_OPTS, etc. by using options like -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:$HADOOP_LOG_DIR/jobtracker.gc.log. Unfortunately, these logs don't survive a restart of the node, so if a node OOM's and then is restarted automatically, or manually by someone who is unaware, we lose the GC logs forever. We also have to manually add GC log options to each daemon. This patch: 1) Creates a single, optional, off by default, parameter for specifying GC logging. 2) If that parameter is set, automatically enables GC logging for all daemons in the cluster. The parameter is flexible enough to allow for the different ways various vendors' JVMs require garbage collection logging to be specified. 3) If GC logging is on, ensures that the GC log files for each daemon are rotated with up to 5 copies kept, same as the .out files currently. We are currently running a variation of this patch in our 0.20 install. This patch actually includes changes to common, mapred, and hdfs, so it obviously cannot be applied as-is, but is included here for review and comments. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-6871) When the value of a configuration key is set to its unresolved form, it causes the IllegalStateException in Configuration.get() stating that substitution depth is too
[ https://issues.apache.org/jira/browse/HADOOP-6871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176842#comment-13176842 ] Harsh J commented on HADOOP-6871: - +1, the approach looks good. Would you be willing to take a shot at rebasing this for trunk? When the value of a configuration key is set to its unresolved form, it causes the IllegalStateException in Configuration.get() stating that substitution depth is too large. - Key: HADOOP-6871 URL: https://issues.apache.org/jira/browse/HADOOP-6871 Project: Hadoop Common Issue Type: Bug Components: conf Reporter: Arvind Prabhakar Attachments: HADOOP-6871-1.patch, HADOOP-6871.patch When a configuration value is set to its unresolved expression string, it leads to recursive substitution attempts in {{Configuration.substituteVars(String)}} method until the max substitution check kicks in and raises an IllegalStateException indicating that the substitution depth is too large. For example, the configuration key {{foobar}} with a value set to {{$\{foobar\}}} will cause this behavior. While this is not a usual use case, it can happen in build environments where a property value is not specified and yet being passed into the test mechanism leading to failures due to this limitation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
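A hedged repro of the failure mode described (the exception text follows Configuration.substituteVars' depth guard; the depth limit is 20 in current trunk):
{code}
// Hedged sketch: a key whose value is its own unresolved expression
// recurses in substituteVars() until the depth check trips.
Configuration conf = new Configuration(false);
conf.set("foobar", "${foobar}");
conf.get("foobar");
// -> java.lang.IllegalStateException:
//    Variable substitution depth too large: 20 ${foobar}
{code}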
[jira] [Commented] (HADOOP-6704) add support for Parascale filesystem
[ https://issues.apache.org/jira/browse/HADOOP-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176843#comment-13176843 ] Harsh J commented on HADOOP-6704: - I'm not sure we still accept alternate FSes upstream, because of the support they'd require even as a contrib module. Perhaps you'd be willing to instead continue carrying this on a github or apache-extras repository, outside of Apache Hadoop? add support for Parascale filesystem Key: HADOOP-6704 URL: https://issues.apache.org/jira/browse/HADOOP-6704 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.20.2 Reporter: Neil Bliss Attachments: HADOOP-6704-10.patch, HADOOP-6704-2.patch, HADOOP-6704-3.patch, HADOOP-6704-4.patch, HADOOP-6704-5.patch, HADOOP-6704-6.patch, HADOOP-6704-7.patch, HADOOP-6704-8.patch, HADOOP-6704.0.20.2.patch, HADOOP-6704.patch, HADOOP-6704_0_20_2-2.patch, HADOOP-6704_0_20_2-3.patch Original Estimate: 168h Remaining Estimate: 168h Parascale has developed an org.apache.hadoop.fs implementation that allows users to use Hadoop on Parascale storage clusters. We'd like to contribute this work to the community. Should this be placed under contrib, or integrated into the org.apache.hadoop.fs space? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-6153) RAgzip: multiple map tasks for a large gzipped file
[ https://issues.apache.org/jira/browse/HADOOP-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176844#comment-13176844 ] Harsh J commented on HADOOP-6153: - Sorry for coming in very late. The patch has gone quite stale for trunk today. Would you be willing to rebase? I've also linked in two related issues. Perhaps you may have comments on these alternative approaches, or they may on this? RAgzip: multiple map tasks for a large gzipped file --- Key: HADOOP-6153 URL: https://issues.apache.org/jira/browse/HADOOP-6153 Project: Hadoop Common Issue Type: Improvement Components: io, native Affects Versions: 0.21.0 Environment: It requires zlib 1.2.2.4 or higher. (We tested on zlib 1.2.3) Reporter: Daehyun Kim Assignee: Daehyun Kim Priority: Minor Attachments: HADOOP-6153.patch, hadoop-6153.txt It supports enabling multiple map tasks for one large gzipped file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-353) Run datanode (or other hadoop servers) inside tomcat
[ https://issues.apache.org/jira/browse/HADOOP-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176856#comment-13176856 ] Harsh J commented on HADOOP-353: Is this still a problem today? If not, we may close this out. Run datanode (or other hadoop servers) inside tomcat Key: HADOOP-353 URL: https://issues.apache.org/jira/browse/HADOOP-353 Project: Hadoop Common Issue Type: Improvement Reporter: eric baldeschwieler Barry Kaplan is running hadoop data nodes inside tomcat and encountering some issues. http://issues.apache.org/jira/browse/HADOOP-211#action_12419360 I'm filing this bug to capture discussion about the pros and cons of such an approach. I'd be curious to know what others on the list (who know more about java/tomcat) think about this proposal. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-2739) SequenceFile and MapFile should either be subclassable or they should be declared final
[ https://issues.apache.org/jira/browse/HADOOP-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176977#comment-13176977 ] Harsh J commented on HADOOP-2739: - +1 for making them final if there's no compatibility breakage. What do you gain by subclassing though? FWIW, metadata support is also provided for custom user tags already. SequenceFile and MapFile should either be subclassable or they should be declared final --- Key: HADOOP-2739 URL: https://issues.apache.org/jira/browse/HADOOP-2739 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.16.0 Reporter: Jim Kellerman Assignee: Jim Kellerman Priority: Minor Neither SequenceFile nor MapFile are currently subclassable as there are no accessor methods to their private members or member functions. Either protected accessor methods should be added or the member variables should be declared as protected instead of private. OR both SequenceFile and MapFile should be declared as final. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
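On the metadata point, a hedged sketch of the existing user-tag support (SequenceFile.Metadata plus the createWriter overload that accepts it; the tag name is made up for illustration):
{code}
// Hedged sketch: custom key/value tags travel in the file header, which
// often removes the need to subclass SequenceFile at all.
SequenceFile.Metadata meta = new SequenceFile.Metadata();
meta.set(new Text("schema.version"), new Text("2"));
SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, path,
    Text.class, Text.class, SequenceFile.CompressionType.NONE,
    null /* codec */, null /* progress */, meta);
// Readers retrieve it via reader.getMetadata().get(new Text("schema.version"))
{code}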
[jira] [Commented] (HADOOP-1860) I'd like a log4j appender that can write to a Hadoop FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176994#comment-13176994 ] Harsh J commented on HADOOP-1860: - I do not see what the merits of this would be. It would only slow things down. IMHO, logging is best left to non-distributed FSes; it works better that way. I'd like a log4j appender that can write to a Hadoop FileSystem --- Key: HADOOP-1860 URL: https://issues.apache.org/jira/browse/HADOOP-1860 Project: Hadoop Common Issue Type: New Feature Components: fs Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: hadoop-log4j.zip It would be convenient to be able to write log files to HDFS and other file systems directly. For large clusters, it will produce too many files, but for small clusters it should be usable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-3281) bin/hadoop script should check class name before running java
[ https://issues.apache.org/jira/browse/HADOOP-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177011#comment-13177011 ] Harsh J commented on HADOOP-3281: - I think a better approach would be to observe java's return code and surface an appropriate message, apart from making the 'feature' of running classes from the 'hadoop classpath' via 'hadoop clazz' more visible via docs/help strings. bin/hadoop script should check class name before running java - Key: HADOOP-3281 URL: https://issues.apache.org/jira/browse/HADOOP-3281 Project: Hadoop Common Issue Type: Bug Components: scripts Reporter: Tsz Wo (Nicholas), SZE Assignee: Edward J. Yoon Attachments: 3281.patch, 3281_v01.patch When the first parameter ($1) cannot be matched with one of the existing hadoop commands, the parameter will be considered a class name and the script will pass it to java. For example, {noformat} bash-3.2$ ./bin/hadoop -version java version "1.5.0_14" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_14-b03) Java HotSpot(TM) Client VM (build 1.5.0_14-b03, mixed mode) bash-3.2$ ./bin/hadoop -help Usage: java [-options] class [args...] (to execute a class) or java [-options] -jar jarfile [args...] (to execute a jar file) ... {noformat} The behavior above is confusing. We should check whether the parameter is a valid class name before passing it to java. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
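As a sketch of the pre-check suggested in the description (the helper class below is hypothetical; the wrapper script would invoke it and only hand the argument to java on a zero exit code):
{code}
/**
 * Hypothetical pre-check the bin/hadoop script could run before treating
 * an unrecognized first argument as a class name.
 */
public class ClassCheck {
  public static void main(String[] args) {
    try {
      // Resolve the name on the classpath without initializing the class.
      Class.forName(args[0], false, ClassCheck.class.getClassLoader());
    } catch (ClassNotFoundException e) {
      System.err.println(args[0] + " is neither a hadoop command nor a class name");
      System.exit(1);
    }
  }
}
{code}
Harsh's suggestion above is the cheaper alternative: skip the pre-check and instead inspect java's exit code after the fact.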
[jira] [Commented] (HADOOP-3669) Aggregate Framework should allow usage of MultipleOutputFormat
[ https://issues.apache.org/jira/browse/HADOOP-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177013#comment-13177013 ] Harsh J commented on HADOOP-3669: - Also, MultipleOutputs replaces all other multiple-output mechanisms. Aggregate Framework should allow usage of MultipleOutputFormat -- Key: HADOOP-3669 URL: https://issues.apache.org/jira/browse/HADOOP-3669 Project: Hadoop Common Issue Type: Improvement Affects Versions: 0.17.0 Reporter: Ankur Assignee: Ankur Attachments: HADOOP-3669_v1.patch Currently the output format is hard-coded to be TextOutputFormat in ValueAggregatorJob responsible for running aggregation jobs using user-defined value aggregator descriptors. This prevents the application writer from specifying an alternate output format. A good use case from an application's perspective is to have a sub-type of MultipleOutputFormat set as output format which takes care of redirecting (key, value) to different files based on type information encoded in them. Applications can extend MultipleTextOutputFormat and define their own multiple output format but they still can't hook it into the value aggregator framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
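To illustrate the use case from the description, a sketch of such a subclass (the "type:payload" key convention is an assumption made for this example, not part of the issue):
{code}
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

/** Hypothetical format routing records to per-type output files. */
public class TypedTextOutputFormat extends MultipleTextOutputFormat<Text, Text> {
  @Override
  protected String generateFileNameForKeyValue(Text key, Text value, String name) {
    // Assume keys carry a "type:payload" prefix encoding the record type.
    String type = key.toString().split(":", 2)[0];
    return type + "/" + name; // e.g. "clicks/part-00000"
  }
}
{code}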
[jira] [Commented] (HADOOP-7910) add configuration methods to handle human readable size values
[ https://issues.apache.org/jira/browse/HADOOP-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176104#comment-13176104 ] Harsh J commented on HADOOP-7910: - Thanks Sho. The patch looks good. I'll commit it once the following couple of nits are addressed: - {{fail();}} messages such as {{Too large number}} are not very descriptive. These calls will fail the test with the string as the reason, so something like Test passed for a number too large or Test passed for a number too small is easier to understand when such a thing happens. Makes sense? (You have the other fail() message correctly written, so just these last two that relate to Longs.) - The {{IllegalArgumentException}} that carries the message {{binary prefix is allowed only k, m, g, t, p, e(case insensitive)}} can be improved. Perhaps something more like: {{Invalid size prefix %char in given string %string. Allowed prefixes are set}}. Know that exception messages and log messages cater to users, and if we can be very clear about what's being given to them, it makes their life easier in hunting down the trouble and fixing it up themselves :) - Javadoc for the getLongBytes method can carry a 'case insensitive' comment, for the devs. add configuration methods to handle human readable size values -- Key: HADOOP-7910 URL: https://issues.apache.org/jira/browse/HADOOP-7910 Project: Hadoop Common Issue Type: Improvement Components: conf Reporter: Sho Shimauchi Assignee: Sho Shimauchi Priority: Minor Attachments: HADOOP-7910.patch, HADOOP-7910.patch, HADOOP-7910.patch.3, hadoop-7910.txt It's better to have new configuration methods which handle human readable size values. For example, see HDFS-1314. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
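For reference, the semantics under review look roughly like the following (an illustrative sketch, not the attached patch; overflow checks, which the "too large" tests above target, are deliberately omitted):
{code}
/** Illustrative parser for human-readable sizes such as "64m" or "1G". */
public class SizeParser {
  public static long parse(String s) {
    s = s.trim();
    char last = Character.toLowerCase(s.charAt(s.length() - 1));
    int shift;
    switch (last) {
      case 'k': shift = 10; break;
      case 'm': shift = 20; break;
      case 'g': shift = 30; break;
      case 't': shift = 40; break;
      case 'p': shift = 50; break;
      case 'e': shift = 60; break;
      default:  return Long.parseLong(s); // no binary prefix present
    }
    long base = Long.parseLong(s.substring(0, s.length() - 1).trim());
    return base << shift; // overflow checking intentionally omitted here
  }

  public static void main(String[] args) {
    System.out.println(parse("64m")); // prints 67108864
  }
}
{code}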
[jira] [Commented] (HADOOP-7504) hadoop-metrics.properties missing some Ganglia31 options
[ https://issues.apache.org/jira/browse/HADOOP-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171821#comment-13171821 ] Harsh J commented on HADOOP-7504: - Given that this is a trivial change of adding comments to a template file, I shall commit it in a couple of days unless there are objections. hadoop-metrics.properties missing some Ganglia31 options - Key: HADOOP-7504 URL: https://issues.apache.org/jira/browse/HADOOP-7504 Project: Hadoop Common Issue Type: Improvement Components: metrics Affects Versions: 0.20.203.0, 0.23.0 Reporter: Eli Collins Assignee: Harsh J Priority: Trivial Labels: newbie Attachments: HADOOP-7504.r1.diff The jvm, rpc, and ugi sections of hadoop-metrics.properties should have Ganglia31 options like dfs and mapred -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
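For illustration, the missing sections would presumably mirror the existing dfs/mapred ones along these lines (a hedged sketch, not the attached diff; @GANGLIA@ is a placeholder for the gmond collector host):
{code}
# Hypothetical jvm/rpc/ugi sections mirroring dfs and mapred.
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=@GANGLIA@:8649

rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=@GANGLIA@:8649

ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
ugi.period=10
ugi.servers=@GANGLIA@:8649
{code}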
[jira] [Commented] (HADOOP-7919) [Doc] Remove hadoop.logfile.* properties.
[ https://issues.apache.org/jira/browse/HADOOP-7919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171827#comment-13171827 ] Harsh J commented on HADOOP-7919: - Javadoc issues aren't from this patch. [Doc] Remove hadoop.logfile.* properties. - Key: HADOOP-7919 URL: https://issues.apache.org/jira/browse/HADOOP-7919 Project: Hadoop Common Issue Type: Improvement Components: documentation Affects Versions: 0.23.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HADOOP-7919.patch The following only resides in core-default.xml and doesn't look like it's used anywhere at all. At least a grep of the prop name and parts of it does not give me back anything at all. These settings are now configurable via generic Log4J opts, via the shipped log4j.properties file in the distributions. {code} 137 <!--- logging properties --> 138 139 <property> 140 <name>hadoop.logfile.size</name> 141 <value>1000</value> 142 <description>The max size of each log file</description> 143 </property> 144 145 <property> 146 <name>hadoop.logfile.count</name> 147 <value>10</value> 148 <description>The max number of log files</description> 149 </property> {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7919) [Doc] Remove hadoop.logfile.* properties.
[ https://issues.apache.org/jira/browse/HADOOP-7919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13168693#comment-13168693 ] Harsh J commented on HADOOP-7919: - {code} ➜ hadoop git:(master) ✗ grep hadoop.logfile -R . ./hadoop-common-project/hadoop-common/src/main/resources/core-default.xml: <name>hadoop.logfile.size</name> ./hadoop-common-project/hadoop-common/src/main/resources/core-default.xml: <name>hadoop.logfile.count</name> ./hadoop-common-project/hadoop-common/target/classes/core-default.xml: <name>hadoop.logfile.size</name> ./hadoop-common-project/hadoop-common/target/classes/core-default.xml: <name>hadoop.logfile.count</name> {code} [Doc] Remove hadoop.logfile.* properties. - Key: HADOOP-7919 URL: https://issues.apache.org/jira/browse/HADOOP-7919 Project: Hadoop Common Issue Type: Improvement Components: documentation Affects Versions: 0.23.0 Reporter: Harsh J Priority: Trivial The following only resides in core-default.xml and doesn't look like it's used anywhere at all. At least a grep of the prop name and parts of it does not give me back anything at all. These settings are now configurable via generic Log4J opts, via the shipped log4j.properties file in the distributions. {code} 137 <!--- logging properties --> 138 139 <property> 140 <name>hadoop.logfile.size</name> 141 <value>1000</value> 142 <description>The max size of each log file</description> 143 </property> 144 145 <property> 146 <name>hadoop.logfile.count</name> 147 <value>10</value> 148 <description>The max number of log files</description> 149 </property> {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7781) Remove RecordIO
[ https://issues.apache.org/jira/browse/HADOOP-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13159450#comment-13159450 ] Harsh J commented on HADOOP-7781: - We should un-deprecate it. It's seemingly used inside typedbytes of streaming today. Where else? Remove RecordIO --- Key: HADOOP-7781 URL: https://issues.apache.org/jira/browse/HADOOP-7781 Project: Hadoop Common Issue Type: Task Components: record Affects Versions: 0.24.0 Reporter: Harsh J Assignee: Harsh J Priority: Minor Fix For: 0.24.0 HADOOP-6155 deprecated RecordIO in 0.21. We should remove it from trunk, as nothing uses it anymore and the tests are taking up resources. We should attempt to remove record IO and also check for any references to it within the MR and HDFS projects. Meanwhile, Avro has come up as a fine replacement for it, and has been in use inside Hadoop for quite a while now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156660#comment-13156660 ] Harsh J commented on HADOOP-1381: - Todd, - The sync _interval_ can be arbitrary, I think; it can even be 0. It should not be negative, so I'll add a check for that instead. Or do you think it's better if we limit the interval to a minimum? Writer tests pass with 0, no problem. - SYNC_INTERVAL is being used by MAPREDUCE right now, and I'll have to carry this out as a cross-project JIRA+patch. The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes - Key: HADOOP-1381 URL: https://issues.apache.org/jira/browse/HADOOP-1381 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 0.22.0 Reporter: Owen O'Malley Assignee: Harsh J Fix For: 0.24.0 Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, HADOOP-1381.r3.diff, HADOOP-1381.r4.diff Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much better if it was configurable with a much higher default (1mb or so?). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7851) Configuration.getClasses() never returns the default value.
[ https://issues.apache.org/jira/browse/HADOOP-7851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156814#comment-13156814 ] Harsh J commented on HADOOP-7851: - Can we not test instead for an empty array and return the default value? This way we properly reuse the getTrimmedStrings function that's already present, instead of directly calling out to the StringUtils one. Configuration.getClasses() never returns the default value. --- Key: HADOOP-7851 URL: https://issues.apache.org/jira/browse/HADOOP-7851 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Uma Maheswara Rao G Labels: configuration Fix For: 0.24.0, 0.23.1 Attachments: HADOOP-7851.patch Configuration.getClasses() never returns the default value. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
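One possible shape of that suggestion, as a patch-style sketch against Configuration (assuming {{getTrimmedStrings}} and {{getClassByName}} behave as they do today; not necessarily the committed fix):
{code}
public Class<?>[] getClasses(String name, Class<?>... defaultValue) {
  // Reuse getTrimmedStrings() rather than calling StringUtils directly,
  // and fall back to the default when the property yields nothing.
  String[] classnames = getTrimmedStrings(name);
  if (classnames == null || classnames.length == 0) {
    return defaultValue;
  }
  try {
    Class<?>[] classes = new Class<?>[classnames.length];
    for (int i = 0; i < classnames.length; i++) {
      classes[i] = getClassByName(classnames[i]);
    }
    return classes;
  } catch (ClassNotFoundException e) {
    throw new RuntimeException(e);
  }
}
{code}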
[jira] [Commented] (HADOOP-7851) Configuration.getClasses() never returns the default value.
[ https://issues.apache.org/jira/browse/HADOOP-7851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156816#comment-13156816 ] Harsh J commented on HADOOP-7851: - This problem exists in 0.20-security as well. Would love it if you could post a patch for that as well. Configuration.getClasses() never returns the default value. --- Key: HADOOP-7851 URL: https://issues.apache.org/jira/browse/HADOOP-7851 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Uma Maheswara Rao G Labels: configuration Fix For: 0.24.0, 0.23.1 Attachments: HADOOP-7851.patch Configuration.getClasses() never returns the default value. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156954#comment-13156954 ] Harsh J commented on HADOOP-1381: - Yes, it would end up writing a marker after each record, as the sync-writing condition is checked after every record append. The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes - Key: HADOOP-1381 URL: https://issues.apache.org/jira/browse/HADOOP-1381 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 0.22.0 Reporter: Owen O'Malley Assignee: Harsh J Fix For: 0.24.0 Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, HADOOP-1381.r3.diff, HADOOP-1381.r4.diff Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much better if it was configurable with a much higher default (1mb or so?). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
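A sketch of the append-time condition being described (field and method names here are assumptions, not the actual SequenceFile.Writer internals):
{code}
/** Sketch of the writer-side sync bookkeeping. */
class SyncSketch {
  long lastSyncPos;  // position where the last sync marker was written
  long syncInterval; // configurable distance; 0 means "after every record"
  long pos;          // bytes written so far

  // Checked after each record append; with syncInterval == 0 this fires
  // every time, hence one marker per record, as noted above.
  boolean shouldSync() {
    if (pos >= lastSyncPos + syncInterval) {
      lastSyncPos = pos;
      return true;
    }
    return false;
  }
}
{code}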
[jira] [Commented] (HADOOP-7806) [DNS] Support binding to sub-interfaces
[ https://issues.apache.org/jira/browse/HADOOP-7806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156560#comment-13156560 ] Harsh J commented on HADOOP-7806: - Findbugs warnings are unrelated to this patch. Similar for javadocs, which this does not touch at all. [DNS] Support binding to sub-interfaces --- Key: HADOOP-7806 URL: https://issues.apache.org/jira/browse/HADOOP-7806 Project: Hadoop Common Issue Type: New Feature Components: util Affects Versions: 0.24.0 Reporter: Harsh J Assignee: Harsh J Fix For: 0.24.0 Attachments: HADOOP-7806.patch, HADOOP-7806.patch Right now, with the {{DNS}} class, we can look up IPs of provided interface names ({{eth0}}, {{vm1}}, etc.). However, it would be useful if the I/F - IP lookup also took a look at subinterfaces ({{eth0:1}}, etc.) and allowed binding to only a specified subinterface / virtual interface. This should be fairly easy to add, by matching against all available interfaces' subinterfaces via Java. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7842) a compress tool for test compress/decompress locally use hadoop compress codecs
[ https://issues.apache.org/jira/browse/HADOOP-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154091#comment-13154091 ] Harsh J commented on HADOOP-7842: - Hello Kang, Could you explain the rationale behind this? Files are already decompressed by {{-cat}} and other shell utils, and by native programs like lzo's. If it's for testing native lib loads, or compression codec loading, we already have TestCodec for that, which I've once explained how to use at: http://search-hadoop.com/m/RqwE52Sh1GI Given these two facts, I believe the only usefulness of adding this is to compress a file given a codec? Perhaps TestCodec may be extended to provide such a utility as well? a compress tool for test compress/decompress locally use hadoop compress codecs --- Key: HADOOP-7842 URL: https://issues.apache.org/jira/browse/HADOOP-7842 Project: Hadoop Common Issue Type: Improvement Components: io Reporter: Kang Xiao Assignee: Kang Xiao Labels: compression Attachments: HADOOP-7842.patch Add a compress tool to test compress/decompress locally using hadoop compress codecs. It can be used as follows: compress a.txt to a.lzo: hadoop jar test.jar org.apache.hadoop.io.compress.CodecTool org.apache.hadoop.io.compress.LzoCodec -c 1024 1024 a.txt a.lzo decompress a.lzo to stdout: hadoop jar test.jar org.apache.hadoop.io.compress.CodecTool org.apache.hadoop.io.compress.LzoCodec -x 1024 1024 a.lzo /dev/stdout -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
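The compress-a-file-given-a-codec core that such a utility (or an extended TestCodec) would need is small; a hedged sketch using only public codec APIs (the class name and argument layout are hypothetical):
{code}
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.util.ReflectionUtils;

/** Hypothetical minimal compressor: args are codec class, input, output. */
public class CodecCompress {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    CompressionCodec codec = (CompressionCodec)
        ReflectionUtils.newInstance(conf.getClassByName(args[0]), conf);
    InputStream in = new FileInputStream(args[1]);
    OutputStream out = codec.createOutputStream(new FileOutputStream(args[2]));
    IOUtils.copyBytes(in, out, 4096, true); // copies and closes both streams
  }
}
{code}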
[jira] [Commented] (HADOOP-7841) Run tests with non-secure random
[ https://issues.apache.org/jira/browse/HADOOP-7841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13153998#comment-13153998 ] Harsh J commented on HADOOP-7841: - +1, the addition of that sysprop looks good. The patch was probably generated without --no-prefix though, so it failed to apply for the build test. Run tests with non-secure random Key: HADOOP-7841 URL: https://issues.apache.org/jira/browse/HADOOP-7841 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Attachments: hadoop-7841.txt Post-mavenization we lost the improvement made by HADOOP-7335 which set up a system property such that Random is seeded by {{urandom}}. This makes the tests run faster and prevents timeouts due to lack of entropy on the build boxes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7795) add -n option for FSshell cat
[ https://issues.apache.org/jira/browse/HADOOP-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13153458#comment-13153458 ] Harsh J commented on HADOOP-7795: - I wonder if we really need this. Given the type of data mostly stored on HDFS, how useful is it to provide an internal ability to print line numbers? It could be done, perhaps slightly sub-optimally, by external programs consuming the output of fs-shell's -cat today (e.g. {{hadoop fs -cat /file | cat -n}}). add -n option for FSshell cat - Key: HADOOP-7795 URL: https://issues.apache.org/jira/browse/HADOOP-7795 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.24.0 Reporter: XieXianshan Assignee: XieXianshan Fix For: 0.24.0 Attachments: HADOOP-7795.patch Add -n option for cat to display the files with line numbers. It's quite useful if you're reading big files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7806) [DNS] Support binding to sub-interfaces
[ https://issues.apache.org/jira/browse/HADOOP-7806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152169#comment-13152169 ] Harsh J commented on HADOOP-7806: - I do wish that there were a naming standard for these NIF names. Impl. could be as easy as a string split and search :) [DNS] Support binding to sub-interfaces --- Key: HADOOP-7806 URL: https://issues.apache.org/jira/browse/HADOOP-7806 Project: Hadoop Common Issue Type: New Feature Components: util Affects Versions: 0.24.0 Reporter: Harsh J Assignee: Harsh J Right now, with the {{DNS}} class, we can look up IPs of provided interface names ({{eth0}}, {{vm1}}, etc.). However, it would be useful if the I/F - IP lookup also took a look at subinterfaces ({{eth0:1}}, etc.) and allowed binding to only a specified subinterface / virtual interface. This should be fairly easy to add, by matching against all available interfaces' subinterfaces via Java. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
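A sketch of that search, via the JDK's subinterface enumeration (the method name is hypothetical; the null fallback is the default-interface behavior discussed in HADOOP-7830):
{code}
import java.net.NetworkInterface;
import java.util.Collections;

public class SubInterfaceLookup {
  /** Hypothetical lookup that also matches subinterfaces like "eth0:1". */
  static NetworkInterface find(String name) throws Exception {
    NetworkInterface nif = NetworkInterface.getByName(name);
    if (nif != null) {
      return nif;
    }
    // Not a top-level interface: walk each interface's subinterfaces.
    for (NetworkInterface top :
        Collections.list(NetworkInterface.getNetworkInterfaces())) {
      for (NetworkInterface sub : Collections.list(top.getSubInterfaces())) {
        if (sub.getName().equals(name)) {
          return sub;
        }
      }
    }
    return null; // caller falls back to the default interface
  }
}
{code}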
[jira] [Commented] (HADOOP-7830) [DNS] Log a WARN when there's no matching interface and we are gonna use the default one.
[ https://issues.apache.org/jira/browse/HADOOP-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152667#comment-13152667 ] Harsh J commented on HADOOP-7830: - Parent JIRA is: HADOOP-7806 [DNS] Log a WARN when there's no matching interface and we are gonna use the default one. - Key: HADOOP-7830 URL: https://issues.apache.org/jira/browse/HADOOP-7830 Project: Hadoop Common Issue Type: Sub-task Components: util Affects Versions: 0.24.0 Reporter: Harsh J Assignee: Harsh J Fix For: 0.24.0 We do not currently log today when a requested interface is not found. Instead the code handles this by using the default interface instead, when the lookup result is null. We should log a WARN when this is done, instead of letting it sneak by. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7825) Hadoop wrapper script not picking up native libs correctly
[ https://issues.apache.org/jira/browse/HADOOP-7825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13151822#comment-13151822 ] Harsh J commented on HADOOP-7825: - Two things I'll need to investigate here before proposing to remove such a check: - What are native libs doing in lib/, when you can't guarantee their platform/arch without using external tools? - What utilizes their presence in lib/ in 205 today. Clearly, using 'file' tells you that those under lib/* are 32-bit binaries in the tar we ship today. This breaks native code loading in all 64-bit deployments. Hadoop wrapper script not picking up native libs correctly -- Key: HADOOP-7825 URL: https://issues.apache.org/jira/browse/HADOOP-7825 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.20.205.0 Environment: Debian 6.0 x86_64 java version 1.6.0_26 Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) Reporter: stephen mulcahy Originally discussed in https://mail-archives.apache.org/mod_mbox/hadoop-common-user/20.mbox/%3C4EC3A3AE.7060402%40deri.org%3E I'm testing out native lib support on our amd64 test cluster running 0.20.205. Running the following ./bin/hadoop jar hadoop-test-0.20.205.0.jar testsequencefile -seed 0 -count 1000 -compressType RECORD xxx -codec org.apache.hadoop.io.compress.GzipCodec -check 2, it fails with WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Looking at bin/hadoop it seems to successfully detect that the native libs are available (they seem to come pre-compiled with 0.20.205 which is nice) if [ -d ${HADOOP_HOME}/lib/native ]; then if [ x$JAVA_LIBRARY_PATH != x ]; then JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:${HADOOP_HOME}/lib/native/${JAVA_PLATFORM} else JAVA_LIBRARY_PATH=${HADOOP_HOME}/lib/native/${JAVA_PLATFORM} fi fi and sets JAVA_LIBRARY_PATH to contain them. Then in the following line, if ${HADOOP_HOME}/lib contains libhadoop.a (which it seems to in the stock tar) then it proceeds to ignore the native libs if [ -e ${HADOOP_PREFIX}/lib/libhadoop.a ]; then JAVA_LIBRARY_PATH=${HADOOP_PREFIX}/lib fi The libhadoop.a in ${HADOOP_HOME}/lib seems to be a copy of the lib/native/Linux-i386-32 going by the filesizes (and also noted by https://mail-archives.apache.org/mod_mbox/hadoop-common-user/20.mbox/%3ccaocnvr2azudnn0lfhmtqumayujytvhfkmmm_j0r-bmxw2wu...@mail.gmail.com%3E) hadoop@testhbase01:~$ ls -la hadoop/lib/libhadoop.* -rw-r--r-- 1 hadoop hadoop 237244 Oct 7 08:20 hadoop/lib/libhadoop.a -rw-r--r-- 1 hadoop hadoop 877 Oct 7 08:20 hadoop/lib/libhadoop.la -rw-r--r-- 1 hadoop hadoop 160438 Oct 7 08:20 hadoop/lib/libhadoop.so -rw-r--r-- 1 hadoop hadoop 160438 Oct 7 08:19 hadoop/lib/libhadoop.so.1.0.0 hadoop@testhbase01:~$ ls -la hadoop/lib/native/Linux-i386-32/ total 728 drwxr-xr-x 3 hadoop hadoop 4096 Nov 15 14:05 . drwxr-xr-x 5 hadoop hadoop 4096 Oct 7 08:24 .. 
-rw-r--r-- 1 hadoop hadoop 237244 Oct 7 08:20 libhadoop.a -rw-r--r-- 1 hadoop hadoop 877 Oct 7 08:20 libhadoop.la -rw-r--r-- 1 hadoop hadoop 160438 Oct 7 08:20 libhadoop.so -rw-r--r-- 1 hadoop hadoop 160438 Oct 7 08:20 libhadoop.so.1 -rw-r--r-- 1 hadoop hadoop 160438 Oct 7 08:20 libhadoop.so.1.0.0 A possible solution includes removing libhadoop.a and friends from ${HADOOP_HOME}/lib and possibly also modifying the hadoop wrapper to remove if [ -e ${HADOOP_PREFIX}/lib/libhadoop.a ]; then JAVA_LIBRARY_PATH=${HADOOP_PREFIX}/lib fi unless there is some other reason for this to exist. This was also noted in a comment to HADOOP-6453 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7781) Remove RecordIO
[ https://issues.apache.org/jira/browse/HADOOP-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13138220#comment-13138220 ] Harsh J commented on HADOOP-7781: - Found no use of the record package in HDFS. Found one use in MapReduce, easily removable via MAPREDUCE-3302. Remove RecordIO --- Key: HADOOP-7781 URL: https://issues.apache.org/jira/browse/HADOOP-7781 Project: Hadoop Common Issue Type: Task Components: record Affects Versions: 0.24.0 Reporter: Harsh J Assignee: Harsh J Priority: Minor Fix For: 0.24.0 HADOOP-6155 deprecated RecordIO in 0.21. We should remove it from trunk, as nothing uses it anymore and the tests are taking up resources. We should attempt to remove record IO and also check for any references to it within the MR and HDFS projects. Meanwhile, Avro has come up as a fine replacement for it, and has been in use inside Hadoop for quite a while now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7736) Remove duplicate call of Path#normalizePath during initialization.
[ https://issues.apache.org/jira/browse/HADOOP-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125621#comment-13125621 ] Harsh J commented on HADOOP-7736: - Thanks for clearing it up, Jakob and Aaron. Thanks for the review as well! I've committed this to trunk. Remove duplicate call of Path#normalizePath during initialization. -- Key: HADOOP-7736 URL: https://issues.apache.org/jira/browse/HADOOP-7736 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.24.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial Fix For: 0.24.0 Attachments: HADOOP-7736.patch Found during code reading on HADOOP-6490, there seems to be an unnecessary call of {{normalizePath(...)}} being made in the constructor {{Path(Path, Path)}}. Since {{initialize(...)}} normalizes its received path string already, it's unnecessary to do it to the path parameter in the constructor's call of the same. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7736) Remove duplicate call of Path#normalizePath during initialization.
[ https://issues.apache.org/jira/browse/HADOOP-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125596#comment-13125596 ] Harsh J commented on HADOOP-7736: - No tests, because the current tests that exist for that particular constructor already cover the change, and they also pass. Remove duplicate call of Path#normalizePath during initialization. -- Key: HADOOP-7736 URL: https://issues.apache.org/jira/browse/HADOOP-7736 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.24.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial Fix For: 0.24.0 Attachments: HADOOP-7736.patch Found during code reading on HADOOP-6490, there seems to be an unnecessary call of {{normalizePath(...)}} being made in the constructor {{Path(Path, Path)}}. Since {{initialize(...)}} normalizes its received path string already, it's unnecessary to do it to the path parameter in the constructor's call of the same. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7542) Change XML format to 1.1 to add support for serializing additional characters
[ https://issues.apache.org/jira/browse/HADOOP-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117220#comment-13117220 ] Harsh J commented on HADOOP-7542: - Vinod - That test seems to pass for me here. I've attached a runlog of it here: http://pastebin.com/JFFBy2A8 {code} ➜ test git:(master) ✗ java -version java version "1.6.0_26" Java(TM) SE Runtime Environment (build 1.6.0_26-b03-384-10M3425) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02-384, mixed mode) {code} Change XML format to 1.1 to add support for serializing additional characters - Key: HADOOP-7542 URL: https://issues.apache.org/jira/browse/HADOOP-7542 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 0.20.2 Reporter: Suhas Gogate Assignee: Michael Katzenellenbogen Fix For: 0.24.0 Attachments: HADOOP-7542-v1.patch, MAPREDUCE-109-v2.patch, MAPREDUCE-109-v3.patch, MAPREDUCE-109-v4.patch, MAPREDUCE-109.patch The feature added by this Jira has a problem when setting some of the invalid xml characters, e.g. ctrl-A: mapred.textoutputformat.separator = \u0001, e.g. String delim = "\u0001"; Conf.set("mapred.textoutputformat.separator", delim); The job client serializes the jobconf with mapred.textoutputformat.separator set to \u0001 (ctrl-A) and the problem happens when it is de-serialized (read back) by the job tracker, where it encounters the invalid xml character. The test for this feature (public testFormatWithCustomSeparator()) does not serialize the jobconf after adding the separator as ctrl-A and hence does not detect the specific problem. Here is an exception: 08/12/06 01:40:50 INFO mapred.FileInputFormat: Total input paths to process : 1 org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: org.xml.sax.SAXParseException: Character reference #1 is an invalid XML character. at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:961) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:864) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:832) at org.apache.hadoop.conf.Configuration.get(Configuration.java:291) at org.apache.hadoop.mapred.JobConf.getJobPriority(JobConf.java:1163) at org.apache.hadoop.mapred.JobInProgress.init(JobInProgress.java:179) at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1783) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888) at org.apache.hadoop.ipc.Client.call(Client.java:715) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) at org.apache.hadoop.mapred.$Proxy1.submitJob(Unknown Source) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:788) at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026) at -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
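A minimal reproduction of the round-trip in question (a sketch assuming current trunk Configuration; \u0001 is legal in XML 1.1 but not in XML 1.0, which is why re-parsing the serialized form fails unless the writer emits an XML 1.1 header):
{code}
import org.apache.hadoop.conf.Configuration;

public class CtrlARoundTrip {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration(false);
    conf.set("mapred.textoutputformat.separator", "\u0001");
    // Serialize as the job client does; feeding this output back through
    // Configuration is where the SAXParseException above originates.
    conf.writeXml(System.out);
  }
}
{code}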
[jira] [Commented] (HADOOP-7542) Change XML format to 1.1 to add support for serializing additional characters
[ https://issues.apache.org/jira/browse/HADOOP-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117222#comment-13117222 ] Harsh J commented on HADOOP-7542: - +1 to revert as this needs to be investigated more. Passes for me, but I need to try additional environments as the previous test was from OSX's java. Please revert, and keep this open, thanks! :) Change XML format to 1.1 to add support for serializing additional characters - Key: HADOOP-7542 URL: https://issues.apache.org/jira/browse/HADOOP-7542 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 0.20.2 Reporter: Suhas Gogate Assignee: Michael Katzenellenbogen Fix For: 0.24.0 Attachments: HADOOP-7542-v1.patch, MAPREDUCE-109-v2.patch, MAPREDUCE-109-v3.patch, MAPREDUCE-109-v4.patch, MAPREDUCE-109.patch The feature added by this Jira has a problem when setting some of the invalid xml characters, e.g. ctrl-A: mapred.textoutputformat.separator = \u0001, e.g. String delim = "\u0001"; Conf.set("mapred.textoutputformat.separator", delim); The job client serializes the jobconf with mapred.textoutputformat.separator set to \u0001 (ctrl-A) and the problem happens when it is de-serialized (read back) by the job tracker, where it encounters the invalid xml character. The test for this feature (public testFormatWithCustomSeparator()) does not serialize the jobconf after adding the separator as ctrl-A and hence does not detect the specific problem. Here is an exception: 08/12/06 01:40:50 INFO mapred.FileInputFormat: Total input paths to process : 1 org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: org.xml.sax.SAXParseException: Character reference #1 is an invalid XML character. at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:961) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:864) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:832) at org.apache.hadoop.conf.Configuration.get(Configuration.java:291) at org.apache.hadoop.mapred.JobConf.getJobPriority(JobConf.java:1163) at org.apache.hadoop.mapred.JobInProgress.init(JobInProgress.java:179) at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1783) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888) at org.apache.hadoop.ipc.Client.call(Client.java:715) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) at org.apache.hadoop.mapred.$Proxy1.submitJob(Unknown Source) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:788) at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026) at -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7683) hdfs-site.xml template has properties that are not used in 20
[ https://issues.apache.org/jira/browse/HADOOP-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116104#comment-13116104 ] Harsh J commented on HADOOP-7683: - Three other files have references to this property. Incorrect in just the same way, so we'll require cleanup on them as well: src/hdfs/org/apache/hadoop/hdfs/DFSConfigKeys.java src/test/org/apache/hadoop/hdfs/server/namenode/TestCheckPointForSecurityTokens.java src/test/org/apache/hadoop/hdfs/TestHDFSServerPorts.java Prop in reference is {{DFS_NAMENODE_HTTP_ADDRESS_KEY}}. Sorry for pitching in late, was traveling yesterday. Arpit - can you update these files appropriately as well? Or let me know, and I'll do it. A separate (but linked) jira is fine I think. hdfs-site.xml template has properties that are not used in 20 - Key: HADOOP-7683 URL: https://issues.apache.org/jira/browse/HADOOP-7683 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.20.205.0 Reporter: Arpit Gupta Assignee: Arpit Gupta Priority: Minor Fix For: 0.20.205.0 Attachments: HADOOP-7683.20s.patch properties dfs.namenode.http-address and dfs.namenode.https-address should be removed -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7683) hdfs-site.xml template has properties that are not used in 20
[ https://issues.apache.org/jira/browse/HADOOP-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116107#comment-13116107 ] Harsh J commented on HADOOP-7683: - Ah, and additionally DFS_NAMENODE_HTTPS_ADDRESS_KEY and DFS_NAMENODE_BACKUP_HTTP_ADDRESS_KEY from src/hdfs/org/apache/hadoop/hdfs/DFSConfigKeys.java again, while we're at it. Unsure how that backup namenode thing got pulled in. We need to remove the docs for these too; that is covered at https://issues.apache.org/jira/browse/HADOOP-7297 which is awaiting a review for removal from the security branch before I commit it in. hdfs-site.xml template has properties that are not used in 20 - Key: HADOOP-7683 URL: https://issues.apache.org/jira/browse/HADOOP-7683 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.20.205.0 Reporter: Arpit Gupta Assignee: Arpit Gupta Priority: Minor Fix For: 0.20.205.0 Attachments: HADOOP-7683.20s.patch properties dfs.namenode.http-address and dfs.namenode.https-address should be removed -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira