[jira] [Commented] (HADOOP-8275) harden serialization logic against malformed or malicious input
[ https://issues.apache.org/jira/browse/HADOOP-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253871#comment-13253871 ] Tom White commented on HADOOP-8275: --- Can you write some unit tests for the new behaviour/method in this patch? It looks like the test in HDFS-3134 is only testing indirectly. harden serialization logic against malformed or malicious input --- Key: HADOOP-8275 URL: https://issues.apache.org/jira/browse/HADOOP-8275 Project: Hadoop Common Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HADOOP-8275.001.patch harden serialization logic against malformed or malicious input -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-8209) Add option to relax build-version check for branch-1
[ https://issues.apache.org/jira/browse/HADOOP-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252691#comment-13252691 ] Tom White commented on HADOOP-8209: --- bq. I renamed VersionInfo#getBuildVersion to getFullVersion to clear up the distinction between the build's version and what was called the build version. Rather than renaming, maybe add getFullVersion() and deprecate getBuildVersion()? Also, is this change needed on trunk as well? Add option to relax build-version check for branch-1 Key: HADOOP-8209 URL: https://issues.apache.org/jira/browse/HADOOP-8209 Project: Hadoop Common Issue Type: Improvement Affects Versions: 1.0.0 Reporter: Eli Collins Assignee: Eli Collins Attachments: hadoop-8209.txt, hadoop-8209.txt In 1.x DNs currently refuse to connect to NNs if their build *revision* (ie svn revision) does not match. TTs refuse to connect to JTs if their build *version* (version, revision, user, and source checksum) does not match. This prevents rolling upgrades, which is intentional, see the discussion in HADOOP-5203. The primary motivation in that jira was (1) it's difficult to guarantee every build on a large cluster got deployed correctly, builds don't get rolled back to old versions by accident etc, and (2) mixed versions can lead to execution problems that are hard to debug. However there are also cases when users know that two builds are compatible, eg when deploying a new build which contains the same contents as the previous one, plus a critical security patch that does not affect compatibility. Currently deploying a one-line patch requires taking down the entire cluster (or trying to work around the issue by lying about the build revision or checksum, yuck). These users would like to be able to perform a rolling upgrade.
In order to support this, let's add an option that is off by default, but, when enabled, makes the DN and TT version check just check for an exact version match (eg 1.0.2) but ignore the build revision (DN) and the source checksum (TT). Two builds still need to match the major, minor, and point numbers, but nothing else.
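The relaxed check described above can be sketched in miniature as follows. This is a hypothetical model of the proposal, not the actual patch; the method name, parameters, and `relaxed` flag are all illustrative.

```java
// Sketch of the proposed relaxed build-version check (hypothetical names).
// Strict mode: version string AND build revision/checksum must match.
// Relaxed mode: only the exact version string (e.g. "1.0.2") must match.
public class VersionCheck {

    public static boolean isPermitted(boolean relaxed,
                                      String localVersion, String localRevision,
                                      String remoteVersion, String remoteRevision) {
        if (!localVersion.equals(remoteVersion)) {
            return false; // major, minor, and point numbers must always match
        }
        return relaxed || localRevision.equals(remoteRevision);
    }

    public static void main(String[] args) {
        // Same version, different svn revision: rejected strictly, allowed relaxed.
        System.out.println(isPermitted(false, "1.0.2", "r100", "1.0.2", "r101")); // false
        System.out.println(isPermitted(true,  "1.0.2", "r100", "1.0.2", "r101")); // true
        // A different point release is rejected either way.
        System.out.println(isPermitted(true,  "1.0.2", "r100", "1.0.3", "r100")); // false
    }
}
```

This keeps the default behaviour unchanged; only clusters that explicitly enable the option get the looser comparison.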
[jira] [Commented] (HADOOP-8209) Add option to relax build-version check for branch-1
[ https://issues.apache.org/jira/browse/HADOOP-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252808#comment-13252808 ] Tom White commented on HADOOP-8209: --- Agree that not changing VersionInfo is better. +1 from me if Jenkins comes back OK. Add option to relax build-version check for branch-1 Key: HADOOP-8209 URL: https://issues.apache.org/jira/browse/HADOOP-8209 Project: Hadoop Common Issue Type: Improvement Affects Versions: 1.0.0 Reporter: Eli Collins Assignee: Eli Collins Attachments: hadoop-8209.txt, hadoop-8209.txt, hadoop-8209.txt In 1.x DNs currently refuse to connect to NNs if their build *revision* (ie svn revision) does not match. TTs refuse to connect to JTs if their build *version* (version, revision, user, and source checksum) does not match. This prevents rolling upgrades, which is intentional, see the discussion in HADOOP-5203. The primary motivation in that jira was (1) it's difficult to guarantee every build on a large cluster got deployed correctly, builds don't get rolled back to old versions by accident etc, and (2) mixed versions can lead to execution problems that are hard to debug. However there are also cases when users know that two builds are compatible, eg when deploying a new build which contains the same contents as the previous one, plus a critical security patch that does not affect compatibility. Currently deploying a one-line patch requires taking down the entire cluster (or trying to work around the issue by lying about the build revision or checksum, yuck). These users would like to be able to perform a rolling upgrade. In order to support this, let's add an option that is off by default, but, when enabled, makes the DN and TT version check just check for an exact version match (eg 1.0.2) but ignore the build revision (DN) and the source checksum (TT). Two builds still need to match the major, minor, and point numbers, but nothing else.
[jira] [Commented] (HADOOP-7030) new topology mapping implementations
[ https://issues.apache.org/jira/browse/HADOOP-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235873#comment-13235873 ] Tom White commented on HADOOP-7030: --- Steve - these are good features, but it's better to introduce API changes at the time the feature is added so that you and reviewers are sure the API fits the use case. new topology mapping implementations Key: HADOOP-7030 URL: https://issues.apache.org/jira/browse/HADOOP-7030 Project: Hadoop Common Issue Type: New Feature Affects Versions: 0.20.1, 0.20.2, 0.21.0 Reporter: Patrick Angeles Assignee: Tom White Attachments: HADOOP-7030-2.patch, HADOOP-7030.patch, HADOOP-7030.patch, HADOOP-7030.patch, topology.patch The default ScriptBasedMapping implementation of DNSToSwitchMapping for determining cluster topology has some drawbacks. Principally, it forks to an OS-specific script. This issue proposes two new Java implementations of DNSToSwitchMapping. TableMapping reads a two column text file that maps an IP or hostname to a rack ID. Ip4RangeMapping reads a three column text file where each line represents a start and end IP range plus a rack ID.
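A TableMapping-style resolver boils down to a lookup table with a default rack for unknown hosts. The sketch below is a simplified, self-contained model of that idea (it takes the table lines in memory rather than from a file, and the class/constant names are illustrative, not Hadoop's):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a table-based DNSToSwitchMapping: each table line is
// "<host-or-ip> <rack-id>"; unknown names fall back to a default rack.
public class TableMappingSketch {
    private static final String DEFAULT_RACK = "/default-rack";
    private final Map<String, String> table = new HashMap<>();

    public TableMappingSketch(List<String> lines) {
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length == 2) {
                table.put(cols[0], cols[1]);
            }
        }
    }

    // Mirrors DNSToSwitchMapping#resolve: one rack ID per input name.
    public List<String> resolve(List<String> names) {
        List<String> racks = new ArrayList<>();
        for (String name : names) {
            racks.add(table.getOrDefault(name, DEFAULT_RACK));
        }
        return racks;
    }

    public static void main(String[] args) {
        TableMappingSketch m = new TableMappingSketch(
            Arrays.asList("10.1.1.1 /rack1", "host2.example.com /rack2"));
        System.out.println(m.resolve(Arrays.asList("10.1.1.1", "unknown-host")));
        // prints [/rack1, /default-rack]
    }
}
```

Because no external script is forked, this kind of mapping is portable and cheap to call from the NameNode.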
[jira] [Commented] (HADOOP-8200) Remove HADOOP_[JOBTRACKER|TASKTRACKER]_OPTS
[ https://issues.apache.org/jira/browse/HADOOP-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236108#comment-13236108 ] Tom White commented on HADOOP-8200: --- +1 Remove HADOOP_[JOBTRACKER|TASKTRACKER]_OPTS Key: HADOOP-8200 URL: https://issues.apache.org/jira/browse/HADOOP-8200 Project: Hadoop Common Issue Type: Improvement Components: conf Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Attachments: hadoop-8200.txt The HADOOP_[JOBTRACKER|TASKTRACKER]_OPTS env variables are no longer needed in trunk/23 since there's no MR1 implementation and the tests don't use them. This makes the patch for HADOOP-8149 easier.
[jira] [Commented] (HADOOP-8167) Configuration deprecation logic breaks backwards compatibility
[ https://issues.apache.org/jira/browse/HADOOP-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229312#comment-13229312 ] Tom White commented on HADOOP-8167: --- #2 seems like the best option in this case. +1 for the patch. Configuration deprecation logic breaks backwards compatibility -- Key: HADOOP-8167 URL: https://issues.apache.org/jira/browse/HADOOP-8167 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.24.0, 0.23.3 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Blocker Fix For: 0.23.3 Attachments: HADOOP-8167.patch The Configuration deprecation logic works as follows. For a deprecated key dK that maps to a new key nK: * on set(dK, V), it stores (nK, V) * on get(dK), it does a reverse lookup from dK to nK and returns get(nK) While this works fine for single set/get operations, the iterator() method that returns an iterator over all config key/values returns only the new keys. This breaks applications that did a set(dK, V) and expect, when iterating over the configuration, to find (dK, V).
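The incompatibility is easy to see in a toy model of the deprecation mapping. The class below is a deliberately minimal stand-in for Hadoop's Configuration (all names hypothetical): set/get on the deprecated key work, but iterating the backing store only ever shows the new key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model of the deprecation logic: set(dK, V) stores under the new key,
// get(dK) reverse-maps to the new key. Iteration then exposes only nK,
// which is the backwards-compatibility break described in the issue.
public class DeprecationModel {
    private final Map<String, String> store = new HashMap<>();
    private final Map<String, String> deprecatedToNew = new HashMap<>();

    public void addDeprecation(String oldKey, String newKey) {
        deprecatedToNew.put(oldKey, newKey);
    }

    public void set(String key, String value) {
        store.put(deprecatedToNew.getOrDefault(key, key), value);
    }

    public String get(String key) {
        return store.get(deprecatedToNew.getOrDefault(key, key));
    }

    public Set<String> keys() { // what iterator() effectively walks over
        return store.keySet();
    }

    public static void main(String[] args) {
        DeprecationModel conf = new DeprecationModel();
        conf.addDeprecation("dK", "nK");
        conf.set("dK", "V");
        System.out.println(conf.get("dK")); // V  -- single get still works
        System.out.println(conf.keys());    // [nK] -- dK is invisible on iteration
    }
}
```

An application that set "dK" and then scans the configuration for it finds nothing, even though get("dK") succeeds.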
[jira] [Commented] (HADOOP-8098) KerberosAuthenticatorHandler should use _HOST replacement to resolve principal name
[ https://issues.apache.org/jira/browse/HADOOP-8098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218419#comment-13218419 ] Tom White commented on HADOOP-8098: --- +1 KerberosAuthenticatorHandler should use _HOST replacement to resolve principal name --- Key: HADOOP-8098 URL: https://issues.apache.org/jira/browse/HADOOP-8098 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 0.24.0, 0.23.2 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.2 Attachments: HADOOP-8098.patch Currently the exact Kerberos principal name has to be set in the configuration of each node. KerberosAuthenticatorHandler should use similar logic to the RPC ports to support HTTP/_HOST@REALM
[jira] [Commented] (HADOOP-8082) add hadoop-client and hadoop-minicluster to the dependency-management section
[ https://issues.apache.org/jira/browse/HADOOP-8082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209552#comment-13209552 ] Tom White commented on HADOOP-8082: --- +1 add hadoop-client and hadoop-minicluster to the dependency-management section - Key: HADOOP-8082 URL: https://issues.apache.org/jira/browse/HADOOP-8082 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0, 0.23.2 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.2 Attachments: HADOOP-8082.patch This will allow other Hadoop sub-projects to use those artifacts without having to specify the version.
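With the artifacts listed in the parent POM's dependencyManagement section, a downstream module's dependency declaration could look roughly like this (an illustrative POM fragment; the exact coordinates in the patch are assumed, not quoted):

```xml
<!-- Downstream module: no <version> element is needed once the artifact
     appears in the hadoop-project dependencyManagement section (sketch). -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
</dependency>
```

Maven then resolves the version from the managed entry, so all Hadoop sub-projects stay on a single, centrally maintained version.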
[jira] [Commented] (HADOOP-8083) javadoc generation for some modules is not done under target/
[ https://issues.apache.org/jira/browse/HADOOP-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209733#comment-13209733 ] Tom White commented on HADOOP-8083: --- ${project.build.directory} is the {{target}} directory so you don't need to append {{target}} to the end of it. javadoc generation for some modules is not done under target/ - Key: HADOOP-8083 URL: https://issues.apache.org/jira/browse/HADOOP-8083 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0, 0.23.2 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.2 Attachments: HADOOP-8083.patch After running a clean build/dist in some modules an 'api/' directory shows up at module root level. The javadoc plugin should be configured to work under 'target/'
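Tom's point about ${project.build.directory} can be illustrated with a plugin configuration fragment. This is a sketch, not the actual patch; it assumes the standard maven-javadoc-plugin outputDirectory parameter:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-javadoc-plugin</artifactId>
  <configuration>
    <!-- ${project.build.directory} already resolves to the module's target/
         directory, so appending "target" again would produce target/target. -->
    <outputDirectory>${project.build.directory}</outputDirectory>
  </configuration>
</plugin>
```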
[jira] [Commented] (HADOOP-8083) javadoc generation for some modules is not done under target/
[ https://issues.apache.org/jira/browse/HADOOP-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210071#comment-13210071 ] Tom White commented on HADOOP-8083: --- +1 javadoc generation for some modules is not done under target/ - Key: HADOOP-8083 URL: https://issues.apache.org/jira/browse/HADOOP-8083 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0, 0.23.2 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.2 Attachments: HADOOP-8083.patch, HADOOP-8083.patch After running a clean build/dist in some modules an 'api/' directory shows up at module root level. The javadoc plugin should be configured to work under 'target/'
[jira] [Commented] (HADOOP-8074) Small bug in hadoop error message for unknown commands
[ https://issues.apache.org/jira/browse/HADOOP-8074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208922#comment-13208922 ] Tom White commented on HADOOP-8074: --- I think this should be reverted so Daryn's feedback can be addressed. Small bug in hadoop error message for unknown commands -- Key: HADOOP-8074 URL: https://issues.apache.org/jira/browse/HADOOP-8074 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 0.24.0 Reporter: Eli Collins Assignee: Colin Patrick McCabe Priority: Trivial Fix For: 0.23.2 Attachments: HADOOP-8074.txt The hadoop fs command should be more user friendly if the user forgets the dash before the command. Also, this should say cat rather than at.
{noformat}
hadoop-0.24.0-SNAPSHOT $ ./bin/hadoop fs cat
at: Unknown command
{noformat}
[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205535#comment-13205535 ] Tom White commented on HADOOP-8055: --- The patch looks fine to me, but I'm confused by why manually creating $HADOOP_HOME/etc/hadoop/core-site.xml avoids the problem - is there some deeper issue? Distribution tar.gz does not contain etc/hadoop/core-site.xml - Key: HADOOP-8055 URL: https://issues.apache.org/jira/browse/HADOOP-8055 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0 Reporter: Eric Charles Assignee: Harsh J Attachments: HADOOP-8055.patch, HADOOP-8055.patch A dist built from trunk (0.24.0-SNAPSHOT) does not contain a core-site.xml in the $HADOOP_HOME/etc/hadoop/ folder. Running $HADOOP_HOME/sbin/start-dfs.sh without that file gives an exception: Exception in thread "main" java.lang.IllegalArgumentException: URI has an authority component at java.io.File.<init>(File.java:368) at org.apache.hadoop.hdfs.server.namenode.NNStorage.getStorageDirectory(NNStorage.java:310) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.<init>(FSEditLog.java:178) ... Manually creating $HADOOP_HOME/etc/hadoop/core-site.xml solves this problem and hadoop starts fine.
[jira] [Commented] (HADOOP-8023) Add unset() method to Configuration
[ https://issues.apache.org/jira/browse/HADOOP-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201441#comment-13201441 ] Tom White commented on HADOOP-8023: --- +1 This is a compatible addition to the 1.x branch. Add unset() method to Configuration --- Key: HADOOP-8023 URL: https://issues.apache.org/jira/browse/HADOOP-8023 Project: Hadoop Common Issue Type: New Feature Components: conf Affects Versions: 1.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Critical Attachments: HADOOP-8023.patch HADOOP-7001 introduced the *Configuration.unset(String)* method. MAPREDUCE-3727 requires that method in order to be back-ported. This is required to fix an issue manifested when running MR/Hive/Sqoop jobs from Oozie, details are in MAPREDUCE-3727.
[jira] [Commented] (HADOOP-8009) Create hadoop-client and hadoop-minicluster artifacts for downstream projects
[ https://issues.apache.org/jira/browse/HADOOP-8009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13199844#comment-13199844 ] Tom White commented on HADOOP-8009: --- +1 for releasing these artifacts for active release branches. There are upcoming minor releases for the 1 and 23 branches (and possibly 22?), so it would be good to incorporate these artifacts into those releases if possible. Create hadoop-client and hadoop-minicluster artifacts for downstream projects -- Key: HADOOP-8009 URL: https://issues.apache.org/jira/browse/HADOOP-8009 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.22.0, 0.23.0, 0.24.0, 0.23.1, 1.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Critical Fix For: 0.23.1 Attachments: HADOOP-8009-existing-releases.patch, HADOOP-8009.patch Using Hadoop from projects like Pig/Hive/Sqoop/Flume/Oozie or any in-house system that interacts with Hadoop is quite challenging for the following reasons: * *Different versions of Hadoop produce different artifacts:* Before Hadoop 0.23 there was a single artifact hadoop-core, starting with Hadoop 0.23 there are several (common, hdfs, mapred*, yarn*) * *There are no 'client' artifacts:* Current artifacts include all JARs needed to run the services, thus bringing into clients several JARs that are not used for job submission/monitoring (servlet, jsp, tomcat, jersey, etc.) * *Doing testing on the client side is also quite challenging as more artifacts have to be included than the dependencies define:* for example, the history-server artifact has to be explicitly included. If using Hadoop 1 artifacts, jersey-server has to be explicitly included. 
* *3rd party dependencies change in Hadoop from version to version:* This makes things complicated for projects that have to deal with multiple versions of Hadoop, as their exclusions list becomes a huge mishmash of artifacts from different Hadoop versions, and it may break things when a particular version of Hadoop requires a dependency that another version of Hadoop does not require. Because of this it would be quite convenient to have the following 'aggregator' artifacts: * *org.apache.hadoop:hadoop-client* : it includes all required JARs to use Hadoop client APIs (excluding all JARs that are not needed for it) * *org.apache.hadoop:hadoop-test* : it includes all required JARs to run Hadoop Mini Clusters These aggregator artifacts would be created for current branches under development (trunk, 0.22, 0.23, 1.0) and for released versions that are still in use. For branches under development, these artifacts would be generated as part of the build. For released versions we would have a special branch used only as a vehicle for publishing the corresponding 'aggregator' artifacts.
[jira] [Commented] (HADOOP-8003) Make SplitCompressionInputStream an interface instead of an abstract class
[ https://issues.apache.org/jira/browse/HADOOP-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195856#comment-13195856 ] Tom White commented on HADOOP-8003: --- I think we should try to do this without breaking compatibility, e.g. by having a new SplittableCompressionCodec interface that returns a SplittableCompressionInputStream interface in its createInputStream method. Make SplitCompressionInputStream an interface instead of an abstract class -- Key: HADOOP-8003 URL: https://issues.apache.org/jira/browse/HADOOP-8003 Project: Hadoop Common Issue Type: New Feature Components: io Affects Versions: 0.21.0, 0.22.0, 0.23.0, 1.0.0 Reporter: Tim Broberg To be splittable, a codec must extend SplittableCompressionCodec, which has a function returning a SplitCompressionInputStream. SplitCompressionInputStream is an abstract class which extends CompressionInputStream, the lowest level compression stream class. So, no codec that wants to be splittable can reuse any code from DecompressorStream or BlockDecompressorStream. You either have to duplicate that code, or not be splittable. SplitCompressionInputStream adds just a few very thin functions. Can we make this an interface rather than an abstract class to allow splittable decompression streams to extend DecompressorStream, BlockDecompressorStream, or whatever else we should scheme up in the future? To my knowledge, this would impact only the BZip2 codec. None of the others implements this form of splittability yet. LineRecordReader looks only at whether the codec is an instance of SplittableCompressionCodec, and then calls the appropriate version of createInputStream. This would not change, so the application code should not have to change, just BZip and SplitCompressionInputStream.
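The interface-based shape argued for above can be sketched like this. All class and method names here are illustrative stand-ins, not the real Hadoop API; the point is only that with an interface, a codec's stream can both reuse a shared base class and advertise splittability.

```java
// Sketch: splittability as an interface instead of an abstract class.
// Names are hypothetical models of CompressionInputStream machinery.
interface SplittableStream {
    long getAdjustedStart();
    long getAdjustedEnd();
}

// Stand-in for a reusable base class like DecompressorStream.
class BaseDecompressorStream {
    public int read() { return -1; } // shared decompression plumbing lives here
}

// With an interface, a splittable stream can extend the base class AND
// implement splittability -- impossible if splittability were an abstract
// class, since Java has no multiple class inheritance.
class SplittableBzip2Stream extends BaseDecompressorStream implements SplittableStream {
    private final long start, end;
    SplittableBzip2Stream(long start, long end) { this.start = start; this.end = end; }
    public long getAdjustedStart() { return start; }
    public long getAdjustedEnd()   { return end; }
}

public class SplittableSketch {
    public static void main(String[] args) {
        SplittableBzip2Stream s = new SplittableBzip2Stream(0, 1024);
        System.out.println(s instanceof SplittableStream); // true
        System.out.println(s.getAdjustedEnd());            // 1024
    }
}
```

A record reader can keep doing an instanceof check against the splittable type, exactly as LineRecordReader does today with SplittableCompressionCodec.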
[jira] [Commented] (HADOOP-7980) API Compatibility between 0.23 and 1.0 in org.apache.hadoop.io.compress.Decompressor
[ https://issues.apache.org/jira/browse/HADOOP-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188557#comment-13188557 ] Tom White commented on HADOOP-7980: --- Implementations of custom decompressors can add the new method (but not mark it with @Override) and it will work in both 1 and 0.23, I think. Would that solve your issue? API Compatibility between 0.23 and 1.0 in org.apache.hadoop.io.compress.Decompressor Key: HADOOP-7980 URL: https://issues.apache.org/jira/browse/HADOOP-7980 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.23.1 Reporter: Jonathan Eagles HADOOP-6835 introduced in org.apache.hadoop.io.compress.Decompressor the public int getRemaining() API. This forces custom decompressors to implement the new API in order to continue to be used.
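Tom's suggestion, modelled in miniature: implement the method without @Override, so that against the 1.x interface (which lacks getRemaining()) it is simply an extra method, while against 0.23 it satisfies the contract. The interface below is a hypothetical stand-in for the 0.23 Decompressor, not the real org.apache.hadoop.io.compress type.

```java
// Stand-in for the 0.23-era Decompressor interface, which gained getRemaining().
interface DecompressorV23 {
    int decompress(byte[] b, int off, int len);
    int getRemaining(); // added by HADOOP-6835
}

class MyDecompressor implements DecompressorV23 {
    public int decompress(byte[] b, int off, int len) { return 0; }

    // Deliberately NOT annotated with @Override: compiled against the 1.x
    // interface this is just an additional public method; compiled against
    // 0.23 it implements the new contract. One source works for both.
    public int getRemaining() { return 0; }
}

public class DecompressorCompat {
    public static void main(String[] args) {
        System.out.println(new MyDecompressor().getRemaining()); // 0
    }
}
```

The one thing to avoid is @Override on the new method, since that annotation would fail to compile against the older interface.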
[jira] [Commented] (HADOOP-7939) Improve Hadoop subcomponent integration in Hadoop 0.23
[ https://issues.apache.org/jira/browse/HADOOP-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185384#comment-13185384 ] Tom White commented on HADOOP-7939: --- What is the reason to revert symlink resolution code back? Agreed - the work in HADOOP-7089 came up with a solution for symlink resolution which we should continue to use. Improve Hadoop subcomponent integration in Hadoop 0.23 -- Key: HADOOP-7939 URL: https://issues.apache.org/jira/browse/HADOOP-7939 Project: Hadoop Common Issue Type: Improvement Components: build, conf, documentation, scripts Affects Versions: 0.23.0 Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 0.23.1 Attachments: HADOOP-7939.patch.txt, hadoop-layout.sh h1. Introduction For the rest of this proposal it is assumed that the current set of Hadoop subcomponents is: * hadoop-common * hadoop-hdfs * hadoop-yarn * hadoop-mapreduce It must be noted that this is an open ended list, though. For example, implementations of additional frameworks on top of yarn (e.g. MPI) would also be considered a subcomponent. h1. Problem statement Currently there's an unfortunate coupling and hard-coding present at the level of launcher scripts, configuration scripts and Java implementation code that prevents us from treating all subcomponents of Hadoop independently of each other. In a lot of places it is assumed that bits and pieces from individual subcomponents *must* be located at predefined places and they can not be dynamically registered/discovered during the runtime. This prevents a truly flexible deployment of Hadoop 0.23. h1. Proposal NOTE: this is NOT a proposal for redefining the layout from HADOOP-6255. The goal here is to keep as much of that layout in place as possible, while permitting different deployment layouts. 
The aim of this proposal is to introduce the needed level of indirection and flexibility in order to accommodate the current assumed layout of Hadoop tarball deployments and all the other styles of deployments as well. To this end the following set of environment variables needs to be uniformly used in all of the subcomponent's launcher scripts, configuration scripts and Java code (SC stands for a literal name of a subcomponent). These variables are expected to be defined by SC-env.sh scripts, and sourcing those files is expected to have the desired effect of setting the environment up correctly.
# HADOOP_SC_HOME ## root of the subtree in a filesystem where a subcomponent is expected to be installed ## default value: $0/..
# HADOOP_SC_JARS ## a subdirectory with all of the jar files comprising subcomponent's implementation ## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)
# HADOOP_SC_EXT_JARS ## a subdirectory with all of the jar files needed for extended functionality of the subcomponent (nonessential for correct work of the basic functionality) ## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)/ext
# HADOOP_SC_NATIVE_LIBS ## a subdirectory with all the native libraries that component requires ## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)/native
# HADOOP_SC_BIN ## a subdirectory with all of the launcher scripts specific to the client side of the component ## default value: $(HADOOP_SC_HOME)/bin
# HADOOP_SC_SBIN ## a subdirectory with all of the launcher scripts specific to the server/system side of the component ## default value: $(HADOOP_SC_HOME)/sbin
# HADOOP_SC_LIBEXEC ## a subdirectory with all of the launcher scripts that are internal to the implementation and should *not* be invoked directly ## default value: $(HADOOP_SC_HOME)/libexec
# HADOOP_SC_CONF ## a subdirectory containing configuration files for a subcomponent ## default value: $(HADOOP_SC_HOME)/conf
# HADOOP_SC_DATA ## a subtree in the local filesystem for storing component's persistent state ## default value: $(HADOOP_SC_HOME)/data
# HADOOP_SC_LOG ## a subdirectory for subcomponent's log files to be stored ## default value: $(HADOOP_SC_HOME)/log
# HADOOP_SC_RUN ## a subdirectory with runtime system specific information ## default value: $(HADOOP_SC_HOME)/run
# HADOOP_SC_TMP ## a subdirectory with temporary files ## default value: $(HADOOP_SC_HOME)/tmp
[jira] [Commented] (HADOOP-7920) Remove Avro RPC
[ https://issues.apache.org/jira/browse/HADOOP-7920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185434#comment-13185434 ] Tom White commented on HADOOP-7920: --- The remaining classes are for AvroSerialization, not for RPC, so they shouldn't be removed. Remove Avro RPC --- Key: HADOOP-7920 URL: https://issues.apache.org/jira/browse/HADOOP-7920 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 0.23.1 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.24.0, 0.23.1 Attachments: HADOOP-2970.patch, HADOOP-7920.patch, HADOOP-7920.patch, HADOOP-7920.txt Please see the discussion in HDFS-2660 for more details. I have created a branch HADOOP-6659 to save the Avro work, if in the future someone wants to use the work that existed to add support for Avro RPC.
[jira] [Commented] (HADOOP-7934) Normalize dependencies versions across all modules
[ https://issues.apache.org/jira/browse/HADOOP-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180672#comment-13180672 ] Tom White commented on HADOOP-7934: --- +1 Thanks for running the tests. Normalize dependencies versions across all modules -- Key: HADOOP-7934 URL: https://issues.apache.org/jira/browse/HADOOP-7934 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.24.0, 0.23.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Critical Fix For: 0.24.0, 0.23.1 Attachments: HADOOP-7934.patch, HADOOP-7934.patch Move all dependencies versions to the dependencyManagement section in the hadoop-project POM Move all plugin versions to the dependencyManagement section in the hadoop-project POM
[jira] [Commented] (HADOOP-7937) Forward port SequenceFile#syncFs and friends from Hadoop 1.x
[ https://issues.apache.org/jira/browse/HADOOP-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180962#comment-13180962 ] Tom White commented on HADOOP-7937: --- The javac warning is because sync() is deprecated. bq. Given that they're just used via DFSClient Should DFSClient in trunk/23 use these methods, given that the append implementation is different to the one in 20? If so then that will need fixing in an HDFS JIRA. I'll go ahead and commit this. Forward port SequenceFile#syncFs and friends from Hadoop 1.x Key: HADOOP-7937 URL: https://issues.apache.org/jira/browse/HADOOP-7937 Project: Hadoop Common Issue Type: New Feature Components: io Affects Versions: 0.22.0, 0.23.1 Reporter: Eli Collins Assignee: Tom White Labels: bigtop Attachments: HADOOP-7937.patch HDFS-200 added a new public API SequenceFile#syncFs, we need to forward port this for compatibility. Looks like it might have introduced other APIs that need forward porting as well (eg LocatedBlocks#setFileLength, and DataNode#getBlockInfo).
[jira] [Commented] (HADOOP-7934) Normalize dependencies versions across all modules
[ https://issues.apache.org/jira/browse/HADOOP-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178915#comment-13178915 ] Tom White commented on HADOOP-7934: --- +1 This looks good to me. Have you been able to run the unit tests successfully? Normalize dependencies versions across all modules -- Key: HADOOP-7934 URL: https://issues.apache.org/jira/browse/HADOOP-7934 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.24.0, 0.23.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Critical Fix For: 0.24.0, 0.23.1 Attachments: HADOOP-7934.patch Move all dependencies versions to the dependencyManagement section in the hadoop-project POM Move all plugin versions to the dependencyManagement section in the hadoop-project POM
[jira] [Commented] (HADOOP-7939) Improve Hadoop subcomponent integration in Hadoop 0.23
[ https://issues.apache.org/jira/browse/HADOOP-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177241#comment-13177241 ] Tom White commented on HADOOP-7939: --- bq. We already have almost *all* of the proposed variables, the trouble is that their naming and usage is very inconsistent. That's what this proposal was aiming at correcting. A patch that fixes the inconsistencies for 0.23 and trunk would be great to have, and is separate from introducing any new environment variables. Would you be able to contribute such a patch? Also, ideally as a user I'd be able to set one HOME environment variable and have all the others set to reasonable defaults. I'm not sure how true that is in 0.23/trunk today. Improve Hadoop subcomponent integration in Hadoop 0.23 -- Key: HADOOP-7939 URL: https://issues.apache.org/jira/browse/HADOOP-7939 Project: Hadoop Common Issue Type: Improvement Components: build, conf, documentation, scripts Affects Versions: 0.23.0 Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 0.23.1

h1. Introduction
For the rest of this proposal it is assumed that the current set of Hadoop subcomponents is:
* hadoop-common
* hadoop-hdfs
* hadoop-yarn
* hadoop-mapreduce
It must be noted that this is an open-ended list, though. For example, implementations of additional frameworks on top of yarn (e.g. MPI) would also be considered a subcomponent.

h1. Problem statement
Currently there's an unfortunate coupling and hard-coding present at the level of launcher scripts, configuration scripts and Java implementation code that prevents us from treating all subcomponents of Hadoop independently of each other. In a lot of places it is assumed that bits and pieces from individual subcomponents *must* be located at predefined places and they can not be dynamically registered/discovered during the runtime. This prevents a truly flexible deployment of Hadoop 0.23.

h1. Proposal
NOTE: this is NOT a proposal for redefining the layout from HADOOP-6255. The goal here is to keep as much of that layout in place as possible, while permitting different deployment layouts. The aim of this proposal is to introduce the needed level of indirection and flexibility in order to accommodate the current assumed layout of Hadoop tarball deployments and all the other styles of deployments as well. To this end the following set of environment variables needs to be uniformly used in all of the subcomponent's launcher scripts, configuration scripts and Java code (SC stands for a literal name of a subcomponent). These variables are expected to be defined by SC-env.sh scripts and sourcing those files is expected to have the desired effect of setting the environment up correctly.
# HADOOP_SC_HOME
## root of the subtree in a filesystem where a subcomponent is expected to be installed
## default value: $0/..
# HADOOP_SC_JARS
## a subdirectory with all of the jar files comprising the subcomponent's implementation
## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)
# HADOOP_SC_EXT_JARS
## a subdirectory with all of the jar files needed for extended functionality of the subcomponent (nonessential for correct work of the basic functionality)
## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)/ext
# HADOOP_SC_NATIVE_LIBS
## a subdirectory with all the native libraries that the component requires
## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)/native
# HADOOP_SC_BIN
## a subdirectory with all of the launcher scripts specific to the client side of the component
## default value: $(HADOOP_SC_HOME)/bin
# HADOOP_SC_SBIN
## a subdirectory with all of the launcher scripts specific to the server/system side of the component
## default value: $(HADOOP_SC_HOME)/sbin
# HADOOP_SC_LIBEXEC
## a subdirectory with all of the launcher scripts that are internal to the implementation and should *not* be invoked directly
## default value: $(HADOOP_SC_HOME)/libexec
# HADOOP_SC_CONF
## a subdirectory containing configuration files for a subcomponent
## default value: $(HADOOP_SC_HOME)/conf
# HADOOP_SC_DATA
## a subtree in the local filesystem for storing the component's persistent state
## default value: $(HADOOP_SC_HOME)/data
# HADOOP_SC_LOG
## a subdirectory for the subcomponent's log files to be stored
## default value: $(HADOOP_SC_HOME)/log
# HADOOP_SC_RUN
## a subdirectory with runtime system specific information
## default value: $(HADOOP_SC_HOME)/run
# HADOOP_SC_TMP
## a subdirectory with temporary files
## default value: $(HADOOP_SC_HOME)/tmp
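The variable scheme above can be sketched as an SC-env.sh script. The sketch below uses hdfs as the subcomponent and only fills in variables the caller has not already set; the derivation logic is my own assumption about how such a script could implement the proposal's defaults, not code from the proposal itself.

```shell
#!/bin/sh
# Hypothetical hdfs-env.sh applying the proposal's defaults (SC=hdfs).
# ${VAR:-default} keeps any value the user exported before sourcing this file.
SC=hdfs
HADOOP_HDFS_HOME="${HADOOP_HDFS_HOME:-$(cd "$(dirname "$0")/.." && pwd)}"
HADOOP_HDFS_JARS="${HADOOP_HDFS_JARS:-$HADOOP_HDFS_HOME/share/hadoop/$SC}"
HADOOP_HDFS_EXT_JARS="${HADOOP_HDFS_EXT_JARS:-$HADOOP_HDFS_HOME/share/hadoop/$SC/ext}"
HADOOP_HDFS_NATIVE_LIBS="${HADOOP_HDFS_NATIVE_LIBS:-$HADOOP_HDFS_HOME/share/hadoop/$SC/native}"
HADOOP_HDFS_CONF="${HADOOP_HDFS_CONF:-$HADOOP_HDFS_HOME/conf}"
export HADOOP_HDFS_HOME HADOOP_HDFS_JARS HADOOP_HDFS_EXT_JARS \
       HADOOP_HDFS_NATIVE_LIBS HADOOP_HDFS_CONF
```

Sourcing such a file gives the "set one HOME variable, get sensible defaults for the rest" behaviour asked for in the comment.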
[jira] [Commented] (HADOOP-7732) hadoop java docs bad pointer to hdfs package
[ https://issues.apache.org/jira/browse/HADOOP-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176718#comment-13176718 ] Tom White commented on HADOOP-7732: --- Hi Matt, there's some more background at MAPREDUCE-559, which removed the public facing HDFS javadoc and added the javadoc-dev ant target for developers. In trunk the HDFS classes are marked @Private, so they would not show up in javadoc anyway, since we use the ExcludePrivateAnnotationsStandardDoclet when generating it. I wonder if the link just needs to be to the HDFS documentation? E.g. http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html. hadoop java docs bad pointer to hdfs package Key: HADOOP-7732 URL: https://issues.apache.org/jira/browse/HADOOP-7732 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 0.20.204.0, 0.20.205.0 Reporter: Arpit Gupta Assignee: Matt Foley Priority: Minor Attachments: HADOOP-7732.patch the following link http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/hdfs/package-summary.html leads to a 404
[jira] [Commented] (HADOOP-7732) hadoop java docs bad pointer to hdfs package
[ https://issues.apache.org/jira/browse/HADOOP-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176312#comment-13176312 ] Tom White commented on HADOOP-7732: --- Since HDFS is accessed by users through FileSystem (in Common), we haven't published javadoc for it in the past, since it's not part of the public API. There is an option to generate HDFS javadoc that developers can use locally. hadoop java docs bad pointer to hdfs package Key: HADOOP-7732 URL: https://issues.apache.org/jira/browse/HADOOP-7732 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 0.20.204.0, 0.20.205.0 Reporter: Arpit Gupta Assignee: Matt Foley Priority: Minor Attachments: HADOOP-7732.patch the following link http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/hdfs/package-summary.html leads to a 404
[jira] [Commented] (HADOOP-7912) test-patch should run eclipse:eclipse to verify that it does not break again
[ https://issues.apache.org/jira/browse/HADOOP-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13167683#comment-13167683 ] Tom White commented on HADOOP-7912: --- Apart from a typo (checkEclpiseGeneration) this looks good. What testing did you do? test-patch should run eclipse:eclipse to verify that it does not break again Key: HADOOP-7912 URL: https://issues.apache.org/jira/browse/HADOOP-7912 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0, 0.23.1 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HADOOP-7912.txt Recently the eclipse:eclipse build was broken. If we are going to document this on the wiki and have many developers use it we should verify that it always works. -- This message is automatically generated by JIRA.
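A test-patch check along the lines this JIRA proposes (the function name, modulo the typo noted above, suggests something like the following; this is my own reconstruction, not the actual patch, and the mvn binary is parameterized so it can be stubbed):

```shell
# Hypothetical test-patch step: verify a patch does not break eclipse:eclipse.
MVN="${MVN:-mvn}"

checkEclipseGeneration () {
  echo "Running $MVN eclipse:eclipse to verify the patch does not break it"
  if ! "$MVN" eclipse:eclipse -DskipTests > /dev/null 2>&1 ; then
    echo "-1 the patch appears to break eclipse:eclipse"
    return 1
  fi
  echo "+1 eclipse:eclipse still works"
  return 0
}
```

Wiring this into test-patch would make the breakage this JIRA describes fail the precommit build instead of being discovered later.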
[jira] [Commented] (HADOOP-7884) test-patch seems to fail when a patch goes across projects (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist.
[ https://issues.apache.org/jira/browse/HADOOP-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13167676#comment-13167676 ] Tom White commented on HADOOP-7884: --- The test-patch builds run with the base directory set to hadoop-{common,hdfs,mapreduce}-project. We could change common so it is run from the top-level, thus allowing cross-project patches, but it would mean that i) unit tests for every project are run for every change, and ii) only patches relative to the top-level would apply. i) is probably a good thing (changes to common should result in HDFS and MapReduce tests being run). For ii), do folks generally produce patches from the top-level anyway? test-patch seems to fail when a patch goes across projects (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist. Key: HADOOP-7884 URL: https://issues.apache.org/jira/browse/HADOOP-7884 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0 Reporter: Alejandro Abdelnur Fix For: 0.24.0 Take for example HDFS-2178, the patch applies cleanly, but test-patch fails.
[jira] [Commented] (HADOOP-7884) test-patch seems to fail when a patch goes across projects (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist.
[ https://issues.apache.org/jira/browse/HADOOP-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13167773#comment-13167773 ] Tom White commented on HADOOP-7884: --- As things stand, test-patch will always do a build from the top-level to check that there are no breakages, but tests are run from hadoop-{common,hdfs,mapreduce}-project so only tests from the relevant project are run. Re: nightly builds, there is one for each project that does a full test-run and build: https://builds.apache.org/job/Hadoop-Common-trunk/, https://builds.apache.org/job/Hadoop-Hdfs-trunk/, https://builds.apache.org/job/Hadoop-Mapreduce-trunk/. Perhaps we need a top-level one. test-patch seems to fail when a patch goes across projects (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist. Key: HADOOP-7884 URL: https://issues.apache.org/jira/browse/HADOOP-7884 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0 Reporter: Alejandro Abdelnur Fix For: 0.24.0 Take for example HDFS-2178, the patch applies cleanly, but test-patch fails.
[jira] [Commented] (HADOOP-7899) Generate proto java files as part of the build
[ https://issues.apache.org/jira/browse/HADOOP-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166500#comment-13166500 ] Tom White commented on HADOOP-7899: --- +1 Generate proto java files as part of the build -- Key: HADOOP-7899 URL: https://issues.apache.org/jira/browse/HADOOP-7899 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.24.0, 0.23.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.24.0, 0.23.1 Attachments: HADOOP-7899v1.patch, HADOOP-7899v1.sh Currently the generated java files are precompiled and checked into the source.
[jira] [Commented] (HADOOP-7810) move hadoop archive to core from tools
[ https://issues.apache.org/jira/browse/HADOOP-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166711#comment-13166711 ] Tom White commented on HADOOP-7810: --- +1 for the trunk patch. move hadoop archive to core from tools -- Key: HADOOP-7810 URL: https://issues.apache.org/jira/browse/HADOOP-7810 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.20.205.0, 0.24.0, 0.23.1, 1.0.0 Reporter: John George Assignee: John George Priority: Blocker Attachments: HADOOP-7810v1-trunk.patch, HADOOP-7810v1-trunk.sh, hadoop-7810.branch-0.20-security.patch, hadoop-7810.branch-0.20-security.patch, hadoop-7810.branch-0.20-security.patch The HadoopArchives classes are included in the $HADOOP_HOME/hadoop_tools.jar, but this file is not found in `hadoop classpath`. A Pig script using HCatalog's dynamic partitioning with HAR enabled will therefore fail if a jar with HAR is not included in the pig call's '-cp' and '-Dpig.additional.jars' arguments. I am not aware of any reason to not include hadoop-tools.jar in 'hadoop classpath'. Will attach a patch soon.
[jira] [Commented] (HADOOP-7738) Document incompatible API changes between 0.20.20x and 0.23.0 release
[ https://issues.apache.org/jira/browse/HADOOP-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163731#comment-13163731 ] Tom White commented on HADOOP-7738: --- Here are some notes on the differences I have found so far between the 0.20.x release series and 0.23.0 which break compatibility in some way. * MAPREDUCE-954 changed some context classes to interfaces (e.g. JobContext, MapContext, TaskAttemptContext, TaskInputOutputContext). This change should not impact user code (since such code doesn't implement these interfaces) although it does mean that user code (including libraries like Pig) will need to be recompiled. See note (1) at http://wiki.eclipse.org/Evolving_Java-based_APIs_2#Evolving_API_packages * MAPREDUCE-901 changes Counter from a class to an interface. Clients need to recompile. * HADOOP-6201 changed FileSystem#listStatus to throw FileNotFoundException when the file is not found, rather than returning null. Clients need to review usage of this method and update their code to handle this case. Document incompatible API changes between 0.20.20x and 0.23.0 release - Key: HADOOP-7738 URL: https://issues.apache.org/jira/browse/HADOOP-7738 Project: Hadoop Common Issue Type: Improvement Reporter: Tom White Assignee: Tom White Priority: Blocker Fix For: 0.23.1 Attachments: apicheck-hadoop-0.20.204.0-0.24.0-SNAPSHOT.txt 0.20.20x to 0.23.0 will be a common upgrade path, so we should document any incompatible API changes that will affect users.
[jira] [Commented] (HADOOP-7874) native libs should be under lib/native/ dir
[ https://issues.apache.org/jira/browse/HADOOP-7874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161071#comment-13161071 ] Tom White commented on HADOOP-7874: --- Sounds reasonable. What testing did you do on the current patch? native libs should be under lib/native/ dir --- Key: HADOOP-7874 URL: https://issues.apache.org/jira/browse/HADOOP-7874 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0, 0.23.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Labels: bigtop Attachments: HADOOP-7874.patch Currently common and hdfs SO files end up under the lib/ dir with all JARs, they should end up under lib/native. In addition, the hadoop-config.sh script needs some cleanup when it comes to native lib handling: * it is using lib/native/${JAVA_PLATFORM} for the java.library.path, when it should use lib/native. * it is looking for build/lib/native, this is from the old ant build, not applicable anymore. * it is looking for the libhdfs.a and adding it to the java.library.path, this is not correct.
[jira] [Commented] (HADOOP-7874) native libs should be under lib/native/ dir
[ https://issues.apache.org/jira/browse/HADOOP-7874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161260#comment-13161260 ] Tom White commented on HADOOP-7874: --- +1 native libs should be under lib/native/ dir --- Key: HADOOP-7874 URL: https://issues.apache.org/jira/browse/HADOOP-7874 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0, 0.23.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Labels: bigtop Attachments: HADOOP-7874.patch Currently common and hdfs SO files end up under the lib/ dir with all JARs, they should end up under lib/native. In addition, the hadoop-config.sh script needs some cleanup when it comes to native lib handling: * it is using lib/native/${JAVA_PLATFORM} for the java.library.path, when it should use lib/native. * it is looking for build/lib/native, this is from the old ant build, not applicable anymore. * it is looking for the libhdfs.a and adding it to the java.library.path, this is not correct.
[jira] [Commented] (HADOOP-7874) native libs should be under lib/native/ dir
[ https://issues.apache.org/jira/browse/HADOOP-7874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13160509#comment-13160509 ] Tom White commented on HADOOP-7874: --- bq. it is using lib/native/${JAVA_PLATFORM} for the java.library.path, when it should use lib/native Why has the arch directory been dropped? native libs should be under lib/native/ dir --- Key: HADOOP-7874 URL: https://issues.apache.org/jira/browse/HADOOP-7874 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0, 0.23.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Labels: bigtop Attachments: HADOOP-7874.patch Currently common and hdfs SO files end up under the lib/ dir with all JARs, they should end up under lib/native. In addition, the hadoop-config.sh script needs some cleanup when it comes to native lib handling: * it is using lib/native/${JAVA_PLATFORM} for the java.library.path, when it should use lib/native. * it is looking for build/lib/native, this is from the old ant build, not applicable anymore. * it is looking for the libhdfs.a and adding it to the java.library.path, this is not correct.
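The layout change being debated can be sketched as follows. The paths and the JAVA_PLATFORM value are illustrative; the real hadoop-config.sh computes the platform string from Java system properties, and this sketch only contrasts the old arch-specific path with the flat path this JIRA proposes.

```shell
# Illustrative only: old ant-era vs proposed mvn-era native library path.
HADOOP_PREFIX="${HADOOP_PREFIX:-/usr/lib/hadoop}"

# Old layout: an arch-specific subdirectory, e.g. lib/native/Linux-amd64-64
JAVA_PLATFORM="Linux-amd64-64"   # hypothetical platform string
OLD_LIB_PATH="$HADOOP_PREFIX/lib/native/$JAVA_PLATFORM"

# New layout proposed here: a flat lib/native directory
NEW_LIB_PATH="$HADOOP_PREFIX/lib/native"

echo "-Djava.library.path=$NEW_LIB_PATH"
```

The follow-up question in the comment is exactly about this difference: dropping the $JAVA_PLATFORM component means a single native directory per install rather than one per architecture.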
[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples
[ https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13155295#comment-13155295 ] Tom White commented on HADOOP-7590: --- Arun, I don't feel strongly about where streaming lives. Alejandro, was there a reason streaming can't go in hadoop-mapreduce-project? Mavenize streaming and MR examples -- Key: HADOOP-7590 URL: https://issues.apache.org/jira/browse/HADOOP-7590 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.24.0, 0.23.1 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh, HADOOP-7590v8.patch, HADOOP-7590v8.sh MR1 code is still available in MR2 for testing contribs. This is temporary until the contrib tests are ported to MR2. As a follow-up, the contrib projects themselves should be mavenized.
[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples
[ https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152972#comment-13152972 ] Tom White commented on HADOOP-7590: --- I don't disagree with this assessment, but can't the work to make streaming work from the command line be done in this JIRA? Regarding the rant: is there a JIRA to fix this? Mavenize streaming and MR examples -- Key: HADOOP-7590 URL: https://issues.apache.org/jira/browse/HADOOP-7590 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.1 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh MR1 code is still available in MR2 for testing contribs. This is temporary until the contrib tests are ported to MR2. As a follow-up, the contrib projects themselves should be mavenized.
[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples
[ https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13153255#comment-13153255 ] Tom White commented on HADOOP-7590: --- Looks good to me, +1. I think it's OK to fix the assembly part in a follow-up as long as it gets into 0.23.1. Mavenize streaming and MR examples -- Key: HADOOP-7590 URL: https://issues.apache.org/jira/browse/HADOOP-7590 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.1 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh, HADOOP-7590v8.patch, HADOOP-7590v8.sh MR1 code is still available in MR2 for testing contribs. This is temporary until the contrib tests are ported to MR2. As a follow-up, the contrib projects themselves should be mavenized.
[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples
[ https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152387#comment-13152387 ] Tom White commented on HADOOP-7590: --- I think the tests are relevant as they act as regression tests. * TestStreamingBadRecords, TestStreamingStatus and MiniMRClientClusterFactory are failing for me with a NPE caused by the new MiniMRClientClusterFactory. Can you check this please? * TestStreamingCombiner is failing because the counter can't be found. This could be a problem in MR on YARN, so it would be OK to look at this separately. Mavenize streaming and MR examples -- Key: HADOOP-7590 URL: https://issues.apache.org/jira/browse/HADOOP-7590 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.1 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh MR1 code is still available in MR2 for testing contribs. This is temporary until the contrib tests are ported to MR2. As a follow-up, the contrib projects themselves should be mavenized.
[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples
[ https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152660#comment-13152660 ] Tom White commented on HADOOP-7590: --- bq. How about opening a JIRA for these failing tests and mark it as a blocker? Sounds reasonable to me. bq. Still missing, but it should be also another JIRA, is how the streaming JAR gets in the classpath of the shell command. Do you mean that with this patch you can't run streaming jobs from the command line? Currently it's possible (albeit tricky since it requires mvn and ant) to build a distribution that supports streaming jobs, so we should continue to support that in this patch. Mavenize streaming and MR examples -- Key: HADOOP-7590 URL: https://issues.apache.org/jira/browse/HADOOP-7590 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.1 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh MR1 code is still available in MR2 for testing contribs. This is temporary until the contrib tests are ported to MR2. As a follow-up, the contrib projects themselves should be mavenized.
[jira] [Commented] (HADOOP-7809) Backport HADOOP-5839 to 0.20-security - fixes to ec2 scripts to allow remote job submission
[ https://issues.apache.org/jira/browse/HADOOP-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146779#comment-13146779 ] Tom White commented on HADOOP-7809: --- Joydeep, have you looked at Whirr (http://whirr.apache.org/)? The EC2 scripts in Hadoop were deprecated in favour of Whirr well over a year ago. Backport HADOOP-5839 to 0.20-security - fixes to ec2 scripts to allow remote job submission --- Key: HADOOP-7809 URL: https://issues.apache.org/jira/browse/HADOOP-7809 Project: Hadoop Common Issue Type: Improvement Components: contrib/cloud Reporter: Joydeep Sen Sarma Assignee: Matt Foley Attachments: hadoop-5839.2.patch The fix for HADOOP-5839 was committed to 0.21 more than a year ago. This bug is to backport the change (which is only 14 lines) to branch-0.20-security. === Original description: i would very much like the option of submitting jobs from a workstation outside ec2 to a hadoop cluster in ec2. This has been explored here: http://www.nabble.com/public-IP-for-datanode-on-EC2-tt19336240.html the net result of this is that we can make this work (along with using a socks proxy) with a couple of changes in the ec2 scripts: a) use public 'hostname' for fs.default.name setting (instead of the private hostname being used currently) b) mark hadoop.rpc.socket.factory.class.default as final variable in the generated hadoop-site.xml (that applies to server side) #a has no downside as far as i can tell since public hostnames resolve to internal/private IP addresses within ec2 (so traffic is optimally routed).
[jira] [Commented] (HADOOP-7801) HADOOP_PREFIX cannot be overriden
[ https://issues.apache.org/jira/browse/HADOOP-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145737#comment-13145737 ] Tom White commented on HADOOP-7801: --- +1 This is a reasonable change. Nit: the change to fuse_dfs_wrapper.sh is not needed since HADOOP_PREFIX is only assigned if it is empty or unset. HADOOP_PREFIX cannot be overriden - Key: HADOOP-7801 URL: https://issues.apache.org/jira/browse/HADOOP-7801 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Bruno Mahé Attachments: HADOOP-7801-2.patch, HADOOP-7801.patch hadoop-config.sh forces HADOOP_PREFIX to a specific value: export HADOOP_PREFIX=`dirname $this`/.. It would be nice to make this overridable.
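The fix described above amounts to assigning HADOOP_PREFIX only when the caller has not already exported it. A minimal sketch of the pattern (not the literal patch hunk):

```shell
# Sketch: compute HADOOP_PREFIX from the script location only when the
# caller has not already set it, so the value can be overridden from the
# environment. The original line unconditionally overwrote it.
this="$0"
export HADOOP_PREFIX="${HADOOP_PREFIX:-$(cd "$(dirname "$this")/.." && pwd)}"
```

With this form, `HADOOP_PREFIX=/opt/hadoop bin/hadoop ...` wins over the computed default, which is the behaviour the JIRA asks for.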
[jira] [Commented] (HADOOP-7802) hadoop script unconditionnaly source $bin/../libexec/hadoop-config.sh
[ https://issues.apache.org/jira/browse/HADOOP-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145739#comment-13145739 ] Tom White commented on HADOOP-7802: --- +1 hadoop script unconditionnaly source $bin/../libexec/hadoop-config.sh --- Key: HADOOP-7802 URL: https://issues.apache.org/jira/browse/HADOOP-7802 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.23.0 Reporter: Bruno Mahé Attachments: HADOOP-7802-2.patch, HADOOP-7802.patch It would be nice to be able to specify some other location for hadoop-config.sh -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
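The change being +1'd here makes the hadoop-config.sh location overridable instead of hard-coding $bin/../libexec. A hedged sketch of the pattern (the HADOOP_LIBEXEC_DIR variable name and the guard are assumptions, not a quote of the committed patch):

```shell
#!/bin/sh
# Resolve hadoop-config.sh from an overridable location rather than
# unconditionally sourcing $bin/../libexec/hadoop-config.sh.
bin=$(cd "$(dirname "$0")" && pwd)
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$bin/../libexec}
if [ -e "$HADOOP_LIBEXEC_DIR/hadoop-config.sh" ]; then
  . "$HADOOP_LIBEXEC_DIR/hadoop-config.sh"
fi
```

Callers can then point HADOOP_LIBEXEC_DIR at any directory containing their own hadoop-config.sh before invoking the hadoop script.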
[jira] [Commented] (HADOOP-7787) binary and source tarball names are not consistent
[ https://issues.apache.org/jira/browse/HADOOP-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145747#comment-13145747 ] Tom White commented on HADOOP-7787: --- I would suggest hadoop-${project.version}.tar.gz for the binary and hadoop-${project.version}-src.tar.gz for the source. We haven't used a -bin suffix in Hadoop before, and this is consistent with the 0.23.0 release candidate Arun created last week (http://people.apache.org/~acmurthy/hadoop-0.23.0-rc2/). binary and source tarball names are not consistent -- Key: HADOOP-7787 URL: https://issues.apache.org/jira/browse/HADOOP-7787 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Bruno Mahé Attachments: HADOOP-7787.patch When building binary and source tarballs, I get the following artifacts: Binary tarball: hadoop-0.23.0-SNAPSHOT.tar.gz Source tarball: hadoop-dist-0.23.0-SNAPSHOT-src.tar.gz Notice the -dist right between hadoop and the version in the source tarball name. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
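Tom's suggested convention can be illustrated with a simple expansion (the version string below is a placeholder for ${project.version}):

```shell
#!/bin/sh
# Proposed, consistent artifact names: no -dist infix, -src suffix for source.
VERSION=0.23.0-SNAPSHOT                  # placeholder for ${project.version}
echo "hadoop-${VERSION}.tar.gz"          # binary tarball
echo "hadoop-${VERSION}-src.tar.gz"      # source tarball
```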
[jira] [Commented] (HADOOP-7590) Mavenize the MR1 JARs (main and test) creation
[ https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141305#comment-13141305 ] Tom White commented on HADOOP-7590: --- Alejandro, the problem you are seeing won't be a problem with MAPREDUCE-3169 (since the tests will run under YARN, not MR1), so perhaps you could make that issue a dependency? Mavenize the MR1 JARs (main and test) creation -- Key: HADOOP-7590 URL: https://issues.apache.org/jira/browse/HADOOP-7590 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.24.0 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh MR1 code is still available in MR2 for testing contribs, though this is temporary until the contrib tests are ported to MR2. As a follow-up, the contrib projects themselves should be mavenized. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7785) Add equals, hashcode, toString to DataChecksum
[ https://issues.apache.org/jira/browse/HADOOP-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13140941#comment-13140941 ] Tom White commented on HADOOP-7785: --- +1 Add equals, hashcode, toString to DataChecksum -- Key: HADOOP-7785 URL: https://issues.apache.org/jira/browse/HADOOP-7785 Project: Hadoop Common Issue Type: Improvement Components: io, util Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hadoop-7785.txt Simple patch to add these functions to the DataChecksum interface. This is handy for the sake of HDFS-2130. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7782) Add a way to merge javadocs?
[ https://issues.apache.org/jira/browse/HADOOP-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139442#comment-13139442 ] Tom White commented on HADOOP-7782: --- The short answer is mvn javadoc:aggregate -Dmaxmemory=1024. However, we should only publish the public API, which means we should use the annotation-aware doclet, and we should group the API by project using the groups parameter. More at http://maven.apache.org/plugins/maven-javadoc-plugin/examples/aggregate.html and http://maven.apache.org/plugins/maven-javadoc-plugin/aggregate-mojo.html I can look at this one if you like. Add a way to merge javadocs? Key: HADOOP-7782 URL: https://issues.apache.org/jira/browse/HADOOP-7782 Project: Hadoop Common Issue Type: Bug Components: build Reporter: Arun C Murthy Priority: Critical With 'mvn javadoc:javadoc' we now get docs spread over the maven modules. Is there a way to stitch them all together? Also, there are some differences in their generation: the hadoop-auth, hadoop-yarn-*, and hadoop-mapreduce-* modules go to a top-level apidocs dir, which isn't the case for hadoop-common and hadoop-hdfs - they go straight to target/site/api. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7741) Maven related JIRAs to backport to 0.23
[ https://issues.apache.org/jira/browse/HADOOP-7741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13136056#comment-13136056 ] Tom White commented on HADOOP-7741: --- HADOOP-7763 Maven related JIRAs to backport to 0.23 --- Key: HADOOP-7741 URL: https://issues.apache.org/jira/browse/HADOOP-7741 Project: Hadoop Common Issue Type: Task Components: build Affects Versions: 0.23.0 Reporter: Alejandro Abdelnur Fix For: 0.23.0 HADOOP-7624 HDFS-2294 MAPREDUCE-3014 HDFS-2322 HADOOP-7642 MAPREDUCE-3171 HADOOP-7737 MAPREDUCE-3177 MAPREDUCE-3003 HADOOP-7590 MAPREDUCE-3024 HADOOP-7538 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7741) Maven related JIRAs to backport to 0.23
[ https://issues.apache.org/jira/browse/HADOOP-7741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13136618#comment-13136618 ] Tom White commented on HADOOP-7741: --- HADOOP-7768 Maven related JIRAs to backport to 0.23 --- Key: HADOOP-7741 URL: https://issues.apache.org/jira/browse/HADOOP-7741 Project: Hadoop Common Issue Type: Task Components: build Affects Versions: 0.23.0 Reporter: Alejandro Abdelnur Fix For: 0.23.0 HADOOP-7624 HDFS-2294 MAPREDUCE-3014 HDFS-2322 HADOOP-7642 MAPREDUCE-3171 HADOOP-7737 MAPREDUCE-3177 MAPREDUCE-3003 HADOOP-7590 MAPREDUCE-3024 HADOOP-7538 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7758) Make GlobFilter class public
[ https://issues.apache.org/jira/browse/HADOOP-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131134#comment-13131134 ] Tom White commented on HADOOP-7758: --- +1 this looks better. Add some class-level javadoc too? Make GlobFilter class public Key: HADOOP-7758 URL: https://issues.apache.org/jira/browse/HADOOP-7758 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.0, 0.24.0 Attachments: HDFS-2474.patch, HDFS-2474.patch Currently the GlobFilter class is package private. As a generic filter it is quite useful (and I've found myself cut-and-pasting it a few times). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7755) Detect MapReduce PreCommit Trunk builds silently failing when running test-patch.sh
[ https://issues.apache.org/jira/browse/HADOOP-7755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130017#comment-13130017 ] Tom White commented on HADOOP-7755: --- +1 Detect MapReduce PreCommit Trunk builds silently failing when running test-patch.sh --- Key: HADOOP-7755 URL: https://issues.apache.org/jira/browse/HADOOP-7755 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Jonathan Eagles Assignee: Jonathan Eagles Priority: Blocker Attachments: HADOOP-7755.patch MapReduce PreCommit build is silently failing, only running a very small portion of tests. The build then errors out, yet +1 is given to the patch. Last known Success build - 307 tests run and passed https://builds.apache.org/view/G-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/990/testReport/ First known Error build - 69 tests run and passed https://builds.apache.org/view/G-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/994/testReport/ Snippet from failed build log - Errors out and then +1 the patch https://builds.apache.org/view/G-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/994/console
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] hadoop-yarn-api ... SUCCESS [19.512s]
[INFO] hadoop-yarn-common FAILURE [13.835s]
[INFO] hadoop-yarn-server-common . SKIPPED
[INFO] hadoop-yarn-server-nodemanager SKIPPED
[INFO] hadoop-yarn-server-resourcemanager SKIPPED
[INFO] hadoop-yarn-server-tests .. SKIPPED
[INFO] hadoop-yarn-server SKIPPED
[INFO] hadoop-yarn-applications-distributedshell . SKIPPED
[INFO] hadoop-yarn-applications .. SKIPPED
[INFO] hadoop-yarn-site .. SKIPPED
[INFO] hadoop-yarn ... SKIPPED
[INFO] hadoop-mapreduce-client-core .. SKIPPED
[INFO] hadoop-mapreduce-client-common SKIPPED
[INFO] hadoop-mapreduce-client-shuffle ... SKIPPED
[INFO] hadoop-mapreduce-client-app ... SKIPPED
[INFO] hadoop-mapreduce-client-hs SKIPPED
[INFO] hadoop-mapreduce-client-jobclient . SKIPPED
[INFO] hadoop-mapreduce-client ... SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 33.784s
[INFO] Finished at: Wed Oct 12 12:03:22 UTC 2011
[INFO] Final Memory: 40M/630M
[INFO]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.2-beta-5:single (tar) on project hadoop-yarn-common: Failed to create assembly: Error adding file 'org.apache.hadoop:hadoop-yarn-api:jar:0.24.0-SNAPSHOT' to archive: /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api/target/classes isn't a file. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hadoop-yarn-common
==
==
Running contrib tests.
==
==
/bin/kill -9 27543
kill: No such process
NOP
==
==
Checking the integrity of system test framework code.
==
==
/bin/kill -9 27548
kill: No such process
NOP
+1 overall. Here are the results of testing the latest attachment
[jira] [Commented] (HADOOP-7737) normalize hadoop-mapreduce hadoop-dist dist/tar build with common/hdfs
[ https://issues.apache.org/jira/browse/HADOOP-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126880#comment-13126880 ] Tom White commented on HADOOP-7737: --- +1 I tried with and without -Dtar. I managed to run a MR job from the exploded directory.
{noformat}
export HADOOP_COMMON_HOME=$(pwd)/$(ls -d hadoop-common-project/hadoop-common/target/hadoop-common-*-SNAPSHOT)
export HADOOP_HDFS_HOME=$(pwd)/$(ls -d hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-*-SNAPSHOT)
export HADOOP_MAPRED_HOME=$(pwd)/$(ls -d hadoop-mapreduce-project/target/hadoop-mapreduce-*-SNAPSHOT)
export YARN_HOME=$HADOOP_MAPRED_HOME
export PATH=$HADOOP_COMMON_HOME/bin:$HADOOP_HDFS_HOME/bin:$HADOOP_MAPRED_HOME/bin:$PATH
cat > $YARN_HOME/conf/yarn-site.xml <<EOF
<?xml version="1.0"?>
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
EOF
cd hadoop-mapreduce-project
ant examples -Dresolvers=internal
cd ..
export HADOOP_CLASSPATH=$YARN_HOME/modules/*
mkdir in
cp BUILDING.txt in/
hadoop jar hadoop-mapreduce-project/build/hadoop-mapreduce-examples-0.24.0-SNAPSHOT.jar wordcount -Dmapreduce.job.user.name=$USER in out
{noformat}
I'll add this to http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment after it's committed. 
normalize hadoop-mapreduce hadoop-dist dist/tar build with common/hdfs Key: HADOOP-7737 URL: https://issues.apache.org/jira/browse/HADOOP-7737 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.0, 0.24.0 Attachments: HADOOP-7737.patch, HADOOP-7737.patch Normalize the build of hadoop-mapreduce and hadoop-dist with hadoop-common and hadoop-hdfs, making the -Pdist and -Dtar Maven options consistent.
* -Pdist should create the layout
* -Dtar should create the TAR
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7738) Document incompatible API changes between 0.20.20x and 0.23.0 release
[ https://issues.apache.org/jira/browse/HADOOP-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126984#comment-13126984 ] Tom White commented on HADOOP-7738: --- I would like to eliminate the false positives (e.g. by excluding them from SigTest), and (time permitting) go through the remaining ones so they can either be fixed if possible, or documented in release notes. Document incompatible API changes between 0.20.20x and 0.23.0 release - Key: HADOOP-7738 URL: https://issues.apache.org/jira/browse/HADOOP-7738 Project: Hadoop Common Issue Type: Improvement Reporter: Tom White Assignee: Tom White Priority: Blocker Fix For: 0.23.0 Attachments: apicheck-hadoop-0.20.204.0-0.24.0-SNAPSHOT.txt 0.20.20x to 0.23.0 will be a common upgrade path, so we should document any incompatible API changes that will affect users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7743) Add Maven profile to create a full source tarball
[ https://issues.apache.org/jira/browse/HADOOP-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127014#comment-13127014 ] Tom White commented on HADOOP-7743: --- +1 Add Maven profile to create a full source tarball - Key: HADOOP-7743 URL: https://issues.apache.org/jira/browse/HADOOP-7743 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 0.23.0, 0.24.0 Attachments: HADOOP-7743.patch, HADOOP-7743.patch Currently we are building binary distributions only. We should also build a full source distribution from where Hadoop can be built. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-7035) Document incompatible API changes between releases
[ https://issues.apache.org/jira/browse/HADOOP-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117776#comment-13117776 ] Tom White commented on HADOOP-7035: ---
1. Is it possible that the tool needs to be applied the other way around, that is having 0.22 as the base and Tested version being 0.21?
The method signature changed, which is reported as the method being removed. In 0.21 it was
{code}
protected SequenceFile.Reader createDataFileReader(FileSystem fs, Path dataFile, Configuration conf)
{code}
and in 0.22 it is
{code}
protected SequenceFile.Reader createDataFileReader(Path dataFile, Configuration conf, SequenceFile.Reader.Option... options)
{code}
2. Did you run the tool against MR only? Hard to believe there were no API changes in HDFS and common.
I ran it against all three. HDFS is marked as @Private, so it won't show up in the report.
3. What is the final goal of this jira? Is it to identify incompatible changes and make a patch for site with the release notes?
Yes, including it in the release notes would be a good start.
If so we can filter out non public changes from the reports generated by SigTest and probably those that do not belong to public APIs in terms of Hadoop annotations, if it makes sense.
The script already uses the annotations to restrict the changes to the public API.
Document incompatible API changes between releases -- Key: HADOOP-7035 URL: https://issues.apache.org/jira/browse/HADOOP-7035 Project: Hadoop Common Issue Type: Improvement Components: documentation Reporter: Tom White Assignee: Tom White Priority: Blocker Fix For: 0.22.0 Attachments: apicheck-hadoop-0.20.203.0-0.20.204.0.txt, apicheck-hadoop-0.21.0-0.22.0-SNAPSHOT.txt, jdiff-with-previous-release.sh, jdiff-with-previous-release.sh We can use JDiff to generate a list of incompatible changes for each release. 
See https://issues.apache.org/jira/browse/HADOOP-6668?focusedCommentId=12860017&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12860017 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira