[jira] [Commented] (HADOOP-8275) harden serialization logic against malformed or malicious input

2012-04-13 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253871#comment-13253871
 ] 

Tom White commented on HADOOP-8275:
---

Can you write some unit tests for the new behaviour/method in this patch? It 
looks like the test in HDFS-3134 only tests it indirectly.

 harden serialization logic against malformed or malicious input
 ---

 Key: HADOOP-8275
 URL: https://issues.apache.org/jira/browse/HADOOP-8275
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HADOOP-8275.001.patch


 harden serialization logic against malformed or malicious input

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8209) Add option to relax build-version check for branch-1

2012-04-12 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252691#comment-13252691
 ] 

Tom White commented on HADOOP-8209:
---

bq. I renamed VersionInfo#getBuildVersion to getFullVersion to clear up the 
 distinction between the build's version and what was called the build 
 version.

Rather than renaming, maybe add getFullVersion() and deprecate 
getBuildVersion()? Also, is this change needed on trunk as well?

 Add option to relax build-version check for branch-1
 

 Key: HADOOP-8209
 URL: https://issues.apache.org/jira/browse/HADOOP-8209
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 1.0.0
Reporter: Eli Collins
Assignee: Eli Collins
 Attachments: hadoop-8209.txt, hadoop-8209.txt


 In 1.x, DNs currently refuse to connect to NNs if their build *revision* (i.e. 
 svn revision) does not match. TTs refuse to connect to JTs if their build 
 *version* (version, revision, user, and source checksum) does not match.
 This prevents rolling upgrades, which is intentional; see the discussion in 
 HADOOP-5203. The primary motivation in that jira was (1) it's difficult to 
 guarantee that every build on a large cluster got deployed correctly, that builds 
 don't get rolled back to old versions by accident, etc., and (2) mixed versions 
 can lead to execution problems that are hard to debug.
 However, there are also cases when users know the two builds are compatible, 
 e.g. when deploying a new build which contains the same contents as the 
 previous one, plus a critical security patch that does not affect 
 compatibility. Currently deploying a one-line patch requires taking down the 
 entire cluster (or trying to work around the issue by lying about the build 
 revision or checksum, yuck). These users would like to be able to perform a 
 rolling upgrade.
 In order to support this, let's add an option that is off by default but, 
 when enabled, makes the DN and TT version check require only an exact 
 version match (e.g. 1.0.2), ignoring the build revision (DN) and the source 
 checksum (TT). Two builds would still need to match the major, minor, and point 
 numbers, but nothing else.
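
A minimal sketch (with hypothetical class and method names, not the actual patch) of the relaxed check described above: with the relax option enabled, only the release version string must match, and the build revision/checksum is ignored.
{noformat}
public class VersionCheck {
  /**
   * Strict mode: version and build revision must both match.
   * Relaxed mode: only the version string (e.g. "1.0.2") must match, so the
   * major, minor and point numbers still have to agree, but nothing else does.
   */
  public static boolean isCompatible(boolean relaxVersionCheck,
                                     String localVersion, String remoteVersion,
                                     String localRevision, String remoteRevision) {
    if (relaxVersionCheck) {
      return localVersion.equals(remoteVersion);
    }
    return localVersion.equals(remoteVersion)
        && localRevision.equals(remoteRevision);
  }

  public static void main(String[] args) {
    // Same release built from different svn revisions: accepted only when relaxed.
    System.out.println(isCompatible(true,  "1.0.2", "1.0.2", "r100", "r101")); // true
    System.out.println(isCompatible(false, "1.0.2", "1.0.2", "r100", "r101")); // false
  }
}
{noformat}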

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8209) Add option to relax build-version check for branch-1

2012-04-12 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252808#comment-13252808
 ] 

Tom White commented on HADOOP-8209:
---

Agree that not changing VersionInfo is better. +1 from me if Jenkins comes back 
OK.

 Add option to relax build-version check for branch-1
 

 Key: HADOOP-8209
 URL: https://issues.apache.org/jira/browse/HADOOP-8209
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 1.0.0
Reporter: Eli Collins
Assignee: Eli Collins
 Attachments: hadoop-8209.txt, hadoop-8209.txt, hadoop-8209.txt


 In 1.x, DNs currently refuse to connect to NNs if their build *revision* (i.e. 
 svn revision) does not match. TTs refuse to connect to JTs if their build 
 *version* (version, revision, user, and source checksum) does not match.
 This prevents rolling upgrades, which is intentional; see the discussion in 
 HADOOP-5203. The primary motivation in that jira was (1) it's difficult to 
 guarantee that every build on a large cluster got deployed correctly, that builds 
 don't get rolled back to old versions by accident, etc., and (2) mixed versions 
 can lead to execution problems that are hard to debug.
 However, there are also cases when users know the two builds are compatible, 
 e.g. when deploying a new build which contains the same contents as the 
 previous one, plus a critical security patch that does not affect 
 compatibility. Currently deploying a one-line patch requires taking down the 
 entire cluster (or trying to work around the issue by lying about the build 
 revision or checksum, yuck). These users would like to be able to perform a 
 rolling upgrade.
 In order to support this, let's add an option that is off by default but, 
 when enabled, makes the DN and TT version check require only an exact 
 version match (e.g. 1.0.2), ignoring the build revision (DN) and the source 
 checksum (TT). Two builds would still need to match the major, minor, and point 
 numbers, but nothing else.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7030) new topology mapping implementations

2012-03-22 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235873#comment-13235873
 ] 

Tom White commented on HADOOP-7030:
---

Steve - these are good features, but it's better to introduce API changes at 
the time the feature is added so that you and reviewers are sure the API fits 
the use case.



 new topology mapping implementations
 

 Key: HADOOP-7030
 URL: https://issues.apache.org/jira/browse/HADOOP-7030
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 0.20.1, 0.20.2, 0.21.0
Reporter: Patrick Angeles
Assignee: Tom White
 Attachments: HADOOP-7030-2.patch, HADOOP-7030.patch, 
 HADOOP-7030.patch, HADOOP-7030.patch, topology.patch


 The default ScriptBasedMapping implementation of DNSToSwitchMapping for 
 determining cluster topology has some drawbacks. Principally, it forks to an 
 OS-specific script.
 This issue proposes two new Java implementations of DNSToSwitchMapping. 
 TableMapping reads a two-column text file that maps an IP or hostname to a 
 rack ID. Ip4RangeMapping reads a three-column text file where each line 
 represents a start and end IP range plus a rack ID.
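
A minimal sketch of the TableMapping idea, assuming the two-column "host-or-IP rackId" file format described above; this is illustrative only, not the actual implementation.
{noformat}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SimpleTableMapping {
  private final Map<String, String> hostToRack = new HashMap<>();

  public SimpleTableMapping(String tableFile) throws IOException {
    // Each line: host or IP, whitespace, rack ID.
    try (BufferedReader in = new BufferedReader(new FileReader(tableFile))) {
      String line;
      while ((line = in.readLine()) != null) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length == 2) {
          hostToRack.put(cols[0], cols[1]);
        }
      }
    }
  }

  /** Resolve each name to a rack, falling back to a default rack when unknown. */
  public List<String> resolve(List<String> names) {
    List<String> racks = new ArrayList<>();
    for (String name : names) {
      racks.add(hostToRack.getOrDefault(name, "/default-rack"));
    }
    return racks;
  }
}
{noformat}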

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8200) Remove HADOOP_[JOBTRACKER|TASKTRACKER]_OPTS

2012-03-22 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236108#comment-13236108
 ] 

Tom White commented on HADOOP-8200:
---

+1

 Remove HADOOP_[JOBTRACKER|TASKTRACKER]_OPTS 
 

 Key: HADOOP-8200
 URL: https://issues.apache.org/jira/browse/HADOOP-8200
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
 Attachments: hadoop-8200.txt


 The HADOOP_[JOBTRACKER|TASKTRACKER]_OPTS env variables are no longer in 
 trunk/23 since there's no MR1 implementation and the tests don't use them. 
 This makes the patch for HADOOP-8149 easier.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8167) Configuration deprecation logic breaks backwards compatibility

2012-03-14 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229312#comment-13229312
 ] 

Tom White commented on HADOOP-8167:
---

#2 seems like the best option in this case. +1 for the patch.

 Configuration deprecation logic breaks backwards compatibility
 --

 Key: HADOOP-8167
 URL: https://issues.apache.org/jira/browse/HADOOP-8167
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Blocker
 Fix For: 0.23.3

 Attachments: HADOOP-8167.patch


 The deprecated Configuration logic works as follows:
 For a deprecated key dK in favor of a new key nK:
 * on set(dK, V), it stores (nK, V)
 * on get(dK), it does a reverse lookup of dK to nK and returns get(nK)
 While this works fine for single set/get operations, the iterator() method, 
 which returns an iterator over all config key/values, returns only the new keys.
 This breaks applications that did a set(dK, V) and expect, when iterating 
 over the configuration, to find (dK, V).
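
A minimal, hypothetical sketch (not the real Configuration class) of the behaviour described above: set(dK, V) stores under the new key and get(dK) forwards to it, so iteration only ever sees the new key, which is exactly what breaks callers expecting (dK, V).
{noformat}
import java.util.HashMap;
import java.util.Map;

public class DeprecationDemo {
  private final Map<String, String> props = new HashMap<>();
  private final Map<String, String> deprecatedToNew = new HashMap<>();

  public void addDeprecation(String oldKey, String newKey) {
    deprecatedToNew.put(oldKey, newKey);
  }

  public void set(String key, String value) {
    // A deprecated key is stored under its replacement.
    props.put(deprecatedToNew.getOrDefault(key, key), value);
  }

  public String get(String key) {
    // A deprecated key is resolved to its replacement before lookup.
    return props.get(deprecatedToNew.getOrDefault(key, key));
  }

  public Map<String, String> snapshot() {
    return props;  // only new keys appear here, hence the iteration problem
  }

  public static void main(String[] args) {
    DeprecationDemo conf = new DeprecationDemo();
    conf.addDeprecation("old.job.name", "new.job.name");
    conf.set("old.job.name", "wordcount");
    System.out.println(conf.get("old.job.name"));  // wordcount
    System.out.println(conf.snapshot());           // {new.job.name=wordcount}
  }
}
{noformat}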

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8098) KerberosAuthenticatorHandler should use _HOST replacement to resolve principal name

2012-02-28 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13218419#comment-13218419
 ] 

Tom White commented on HADOOP-8098:
---

+1

 KerberosAuthenticatorHandler should use _HOST replacement to resolve 
 principal name
 ---

 Key: HADOOP-8098
 URL: https://issues.apache.org/jira/browse/HADOOP-8098
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 0.24.0, 0.23.2
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.2

 Attachments: HADOOP-8098.patch


 Currently the exact Kerberos principal name has to be set in the 
 configuration of each node.
 KerberosAuthenticatorHandler should apply the same _HOST substitution logic as 
 is done for the RPC principals, to support HTTP/_HOST@REALM.
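
A minimal sketch of the _HOST substitution being requested, assuming the same convention used for RPC principals (replace the _HOST token with the node's canonical hostname); class and method names are hypothetical.
{noformat}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

public class PrincipalResolver {
  /** Expand "HTTP/_HOST@REALM" into "HTTP/fqdn@REALM" for the local node. */
  public static String resolvePrincipal(String principal) throws UnknownHostException {
    if (principal == null || !principal.contains("_HOST")) {
      return principal;
    }
    String fqdn = InetAddress.getLocalHost().getCanonicalHostName()
                             .toLowerCase(Locale.US);
    return principal.replace("_HOST", fqdn);
  }

  public static void main(String[] args) throws UnknownHostException {
    System.out.println(resolvePrincipal("HTTP/_HOST@EXAMPLE.COM"));
  }
}
{noformat}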

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8082) add hadoop-client and hadoop-minicluster to the dependency-management section

2012-02-16 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209552#comment-13209552
 ] 

Tom White commented on HADOOP-8082:
---

+1

 add hadoop-client and hadoop-minicluster to the dependency-management section
 -

 Key: HADOOP-8082
 URL: https://issues.apache.org/jira/browse/HADOOP-8082
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0, 0.23.2
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.2

 Attachments: HADOOP-8082.patch


 This will allow other Hadoop sub-projects to use those artifacts without 
 having to specify the version.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8083) javadoc generation for some modules is not done under target/

2012-02-16 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209733#comment-13209733
 ] 

Tom White commented on HADOOP-8083:
---

${project.build.directory} is the {{target}} directory so you don't need to 
append {{target}} to the end of it.

 javadoc generation for some modules is not done under target/
 -

 Key: HADOOP-8083
 URL: https://issues.apache.org/jira/browse/HADOOP-8083
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0, 0.23.2
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.2

 Attachments: HADOOP-8083.patch


 After running a clean build/dist, an 'api/' directory shows up at the module 
 root level in some modules.
 The javadoc plugin should be configured to work under 'target/'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8083) javadoc generation for some modules is not done under target/

2012-02-16 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13210071#comment-13210071
 ] 

Tom White commented on HADOOP-8083:
---

+1

 javadoc generation for some modules is not done under target/
 -

 Key: HADOOP-8083
 URL: https://issues.apache.org/jira/browse/HADOOP-8083
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0, 0.23.2
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.2

 Attachments: HADOOP-8083.patch, HADOOP-8083.patch


 After running a clean build/dist, an 'api/' directory shows up at the module 
 root level in some modules.
 The javadoc plugin should be configured to work under 'target/'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8074) Small bug in hadoop error message for unknown commands

2012-02-15 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208922#comment-13208922
 ] 

Tom White commented on HADOOP-8074:
---

I think this should be reverted so Daryn's feedback can be addressed. 

 Small bug in hadoop error message for unknown commands
 --

 Key: HADOOP-8074
 URL: https://issues.apache.org/jira/browse/HADOOP-8074
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.24.0
Reporter: Eli Collins
Assignee: Colin Patrick McCabe
Priority: Trivial
 Fix For: 0.23.2

 Attachments: HADOOP-8074.txt


 The hadoop fs command should be more user-friendly if the user forgets the 
 dash before the command. Also, this should say "cat" rather than "at".
 {noformat}
 hadoop-0.24.0-SNAPSHOT $ ./bin/hadoop fs cat
 at: Unknown command
 {noformat}
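
A minimal sketch of a friendlier error path, assuming the shell strips a leading '-' before command lookup (which is why "cat" is reported as "at"); the helper below is hypothetical, not the actual FsShell code.
{noformat}
public class CommandHint {
  static String errorFor(String arg) {
    if (!arg.startsWith("-")) {
      // Don't blindly strip the first character; suggest the dashed form instead.
      return arg + ": Unknown command. Did you mean -" + arg + "?";
    }
    return arg.substring(1) + ": Unknown command";
  }

  public static void main(String[] args) {
    System.out.println(errorFor("cat"));   // suggests -cat
    System.out.println(errorFor("-foo"));  // foo: Unknown command
  }
}
{noformat}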

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8055) Distribution tar.gz does not contain etc/hadoop/core-site.xml

2012-02-10 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205535#comment-13205535
 ] 

Tom White commented on HADOOP-8055:
---

The patch looks fine to me, but I'm confused about why manually creating 
$HADOOP_HOME/etc/hadoop/core-site.xml avoids the problem - is there some deeper 
issue?

 Distribution tar.gz does not contain etc/hadoop/core-site.xml
 -

 Key: HADOOP-8055
 URL: https://issues.apache.org/jira/browse/HADOOP-8055
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0
Reporter: Eric Charles
Assignee: Harsh J
 Attachments: HADOOP-8055.patch, HADOOP-8055.patch


 A dist built from trunk (0.24.0-SNAPSHOT) does not contain a core-site.xml in 
 the $HADOOP_HOME/etc/hadoop/ folder.
 Running $HADOOP_HOME/sbin/start-dfs.sh without it gives an exception:
 Exception in thread "main" java.lang.IllegalArgumentException: URI has an 
 authority component
  at java.io.File.<init>(File.java:368)
  at 
 org.apache.hadoop.hdfs.server.namenode.NNStorage.getStorageDirectory(NNStorage.java:310)
  at org.apache.hadoop.hdfs.server.namenode.FSEditLog.<init>(FSEditLog.java:178)
 ...
 Manually creating $HADOOP_HOME/etc/hadoop/core-site.xml solves this problem 
 and hadoop starts fine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8023) Add unset() method to Configuration

2012-02-06 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13201441#comment-13201441
 ] 

Tom White commented on HADOOP-8023:
---

+1 This is a compatible addition to the 1.x branch.

 Add unset() method to Configuration
 ---

 Key: HADOOP-8023
 URL: https://issues.apache.org/jira/browse/HADOOP-8023
 Project: Hadoop Common
  Issue Type: New Feature
  Components: conf
Affects Versions: 1.0.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
 Attachments: HADOOP-8023.patch


 HADOOP-7001 introduced the *Configuration.unset(String)* method.
 MAPREDUCE-3727 requires that method in order to be back-ported.
 This is required to fix an issue manifested when running MR/Hive/Sqoop jobs 
 from Oozie; details are in MAPREDUCE-3727.
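
A small usage sketch of the Configuration.unset(String) method referenced above (introduced by HADOOP-7001): setting a property and then removing it so that get() returns null again.
{noformat}
import org.apache.hadoop.conf.Configuration;

public class UnsetDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);  // no default resources
    conf.set("my.temp.property", "value");
    System.out.println(conf.get("my.temp.property")); // value
    conf.unset("my.temp.property");
    System.out.println(conf.get("my.temp.property")); // null
  }
}
{noformat}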

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8009) Create hadoop-client and hadoop-minicluster artifacts for downstream projects

2012-02-03 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13199844#comment-13199844
 ] 

Tom White commented on HADOOP-8009:
---

+1 for releasing these artifacts for active release branches. There are 
upcoming minor releases for the 1 and 23 branches (and possibly 22?), so it 
would be good to incorporate these artifacts into those releases if possible.

 Create hadoop-client and hadoop-minicluster artifacts for downstream projects 
 --

 Key: HADOOP-8009
 URL: https://issues.apache.org/jira/browse/HADOOP-8009
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.22.0, 0.23.0, 0.24.0, 0.23.1, 1.0.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
 Fix For: 0.23.1

 Attachments: HADOOP-8009-existing-releases.patch, HADOOP-8009.patch


 Using Hadoop from projects like Pig/Hive/Sqoop/Flume/Oozie or any in-house 
 system that interacts with Hadoop is quite challenging for the following 
 reasons:
 * *Different versions of Hadoop produce different artifacts:* Before Hadoop 
 0.23 there was a single artifact hadoop-core, starting with Hadoop 0.23 there 
 are several (common, hdfs, mapred*, yarn*)
 * *There are no 'client' artifacts:* Current artifacts include all JARs 
 needed to run the services, thus bringing into clients several JARs that are 
 not used for job submission/monitoring (servlet, jsp, tomcat, jersey, etc.)
 * *Doing testing on the client side is also quite challenging as more 
 artifacts have to be included than the dependencies define:* for example, the 
 history-server artifact has to be explicitly included. If using Hadoop 1 
 artifacts, jersey-server has to be explicitly included.
 * *3rd party dependencies change in Hadoop from version to version:* This 
 makes things complicated for projects that have to deal with multiple 
 versions of Hadoop, as their exclusion lists become a huge mix & match of 
 artifacts from different Hadoop versions, and it may break things when a 
 particular version of Hadoop requires a dependency that another version of 
 Hadoop does not require.
 Because of this it would be quite convenient to have the following 
 'aggregator' artifacts:
 * *org.apache.hadoop:hadoop-client* : it includes all required JARs to use 
 Hadoop client APIs (excluding all JARs that are not needed for it)
 * *org.apache.hadoop:hadoop-test* : it includes all required JARs to run 
 Hadoop Mini Clusters
 These aggregator artifacts would be created for current branches under 
 development (trunk, 0.22, 0.23, 1.0) and for released versions that are still 
 in use.
 For branches under development, these artifacts would be generated as part of 
 the build.
 For released versions we would have a special branch used only as a vehicle 
 for publishing the corresponding 'aggregator' artifacts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8003) Make SplitCompressionInputStream an interface instead of an abstract class

2012-01-29 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195856#comment-13195856
 ] 

Tom White commented on HADOOP-8003:
---

I think we should try to do this without breaking compatibility, e.g. by having 
a new SplittableCompressionCodec interface that returns a 
SplittableCompressionInputStream interface in its createInputStream method.
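
A rough sketch of the shape of that compatible alternative; the interface names below are hypothetical placeholders, not the proposed API. The point is that createInputStream returns a stream *interface*, so implementations are free to extend DecompressorStream, BlockDecompressorStream, or anything else.
{noformat}
import java.io.IOException;
import java.io.InputStream;

// Hypothetical names, illustrating the interface-based variant only.
interface SplittableStream {
  long getAdjustedStart();
  long getAdjustedEnd();
  int read() throws IOException;
}

interface SplittableCodec {
  SplittableStream createInputStream(InputStream in, long start, long end)
      throws IOException;
}
{noformat}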

 Make SplitCompressionInputStream an interface instead of an abstract class
 --

 Key: HADOOP-8003
 URL: https://issues.apache.org/jira/browse/HADOOP-8003
 Project: Hadoop Common
  Issue Type: New Feature
  Components: io
Affects Versions: 0.21.0, 0.22.0, 0.23.0, 1.0.0
Reporter: Tim Broberg

 To be splittable, a codec must extend SplittableCompressionCodec which has a 
 function returning a SplitCompressionInputStream.
 SplitCompressionInputStream is an abstract class which extends 
 CompressionInputStream, the lowest level compression stream class.
 So, no codec that wants to be splittable can reuse any code from 
 DecompressorStream or BlockDecompressorStream.
 You either have to duplicate that code, or not be splittable.
 SplitCompressionInputStream adds just a few very thin functions. Can we make 
 this an interface rather than an abstract class to allow splittable 
 decompression streams to extend DecompressorStream, BlockDecompressorStream, 
 or whatever else we should scheme up in the future?
 To my knowledge, this would impact only the BZip2 codec. None of the others 
 implement this form of splittability yet.
 LineRecordReader looks only at whether the codec is an instance of 
 SplittableCompressionCodec, and then calls the appropriate version of 
 createInputStream. This would not change, so the application code should not 
 have to change, just BZip and SplitCompressionInputStream.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7980) API Compatibility between 0.23 and 1.0 in org.apache.hadoop.io.compress.Decompressor

2012-01-18 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188557#comment-13188557
 ] 

Tom White commented on HADOOP-7980:
---

Implementations of custom decompressors can add the new method (but not mark it 
with @Override) and it will work in both 1 and 0.23, I think. Would that solve 
your issue?
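
A sketch of that suggestion for a custom Decompressor; the class is left abstract so the remaining Decompressor methods are elided, and only the part relevant to the compatibility trick is shown.
{noformat}
import org.apache.hadoop.io.compress.Decompressor;

public abstract class MyDecompressor implements Decompressor {
  private int bytesRemaining;

  // Deliberately no @Override: on Hadoop 1.x this is just an extra public
  // method, on 0.23 it satisfies the new Decompressor#getRemaining() contract,
  // so the same source compiles against both.
  public int getRemaining() {
    return bytesRemaining;
  }
}
{noformat}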

 API Compatibility between 0.23 and 1.0 in 
 org.apache.hadoop.io.compress.Decompressor
 

 Key: HADOOP-7980
 URL: https://issues.apache.org/jira/browse/HADOOP-7980
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.23.1
Reporter: Jonathan Eagles

 HADOOP-6835 introduced the public int getRemaining() API in 
 org.apache.hadoop.io.compress.Decompressor. This forces custom decompressors 
 to implement the new API in order to continue to be used.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7939) Improve Hadoop subcomponent integration in Hadoop 0.23

2012-01-12 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185384#comment-13185384
 ] 

Tom White commented on HADOOP-7939:
---

bq. What is the reason to revert symlink resolution code back?

Agreed - the work in HADOOP-7089 came up with a solution for symlink resolution 
which we should continue to use.


 Improve Hadoop subcomponent integration in Hadoop 0.23
 --

 Key: HADOOP-7939
 URL: https://issues.apache.org/jira/browse/HADOOP-7939
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build, conf, documentation, scripts
Affects Versions: 0.23.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Fix For: 0.23.1

 Attachments: HADOOP-7939.patch.txt, hadoop-layout.sh


 h1. Introduction
 For the rest of this proposal it is assumed that the current set
 of Hadoop subcomponents is:
  * hadoop-common
  * hadoop-hdfs
  * hadoop-yarn
  * hadoop-mapreduce
 It must be noted that this is an open ended list, though. For example,
 implementations of additional frameworks on top of yarn (e.g. MPI) would
 also be considered a subcomponent.
 h1. Problem statement
 Currently there's an unfortunate coupling and hard-coding present at the
 level of launcher scripts, configuration scripts and Java implementation
 code that prevents us from treating all subcomponents of Hadoop independently
 of each other. In a lot of places it is assumed that bits and pieces
 from individual subcomponents *must* be located at predefined places
 and they can not be dynamically registered/discovered during the runtime.
 This prevents a truly flexible deployment of Hadoop 0.23. 
 h1. Proposal
 NOTE: this is NOT a proposal for redefining the layout from HADOOP-6255. 
 The goal here is to keep as much of that layout in place as possible,
 while permitting different deployment layouts.
 The aim of this proposal is to introduce the needed level of indirection and
 flexibility in order to accommodate the current assumed layout of Hadoop 
 tarball
 deployments and all the other styles of deployments as well. To this end the
 following set of environment variables needs to be uniformly used in all of
 the subcomponent's launcher scripts, configuration scripts and Java code
 (SC stands for a literal name of a subcomponent). These variables are
 expected to be defined by SC-env.sh scripts and sourcing those files is
 expected to have the desired effect of setting the environment up correctly.
   # HADOOP_SC_HOME
## root of the subtree in a filesystem where a subcomponent is expected to 
 be installed 
## default value: $0/..
   # HADOOP_SC_JARS 
## a subdirectory with all of the jar files comprising subcomponent's 
 implementation 
## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)
   # HADOOP_SC_EXT_JARS
## a subdirectory with all of the jar files needed for extended 
 functionality of the subcomponent (nonessential for correct work of the basic 
 functionality)
## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)/ext
   # HADOOP_SC_NATIVE_LIBS
## a subdirectory with all the native libraries that component requires
## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)/native
   # HADOOP_SC_BIN
## a subdirectory with all of the launcher scripts specific to the client 
 side of the component
## default value: $(HADOOP_SC_HOME)/bin
   # HADOOP_SC_SBIN
## a subdirectory with all of the launcher scripts specific to the 
 server/system side of the component
## default value: $(HADOOP_SC_HOME)/sbin
   # HADOOP_SC_LIBEXEC
## a subdirectory with all of the launcher scripts that are internal to 
 the implementation and should *not* be invoked directly
## default value: $(HADOOP_SC_HOME)/libexec
   # HADOOP_SC_CONF
## a subdirectory containing configuration files for a subcomponent
## default value: $(HADOOP_SC_HOME)/conf
   # HADOOP_SC_DATA
## a subtree in the local filesystem for storing component's persistent 
 state
## default value: $(HADOOP_SC_HOME)/data
   # HADOOP_SC_LOG
## a subdirectory for subcomponents's log files to be stored
## default value: $(HADOOP_SC_HOME)/log
   # HADOOP_SC_RUN
## a subdirectory with runtime system specific information
## default value: $(HADOOP_SC_HOME)/run
   # HADOOP_SC_TMP
## a subdirectory with temporary files
## default value: $(HADOOP_SC_HOME)/tmp

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7920) Remove Avro RPC

2012-01-12 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185434#comment-13185434
 ] 

Tom White commented on HADOOP-7920:
---

The remaining classes are for AvroSerialization, not for RPC, so they shouldn't 
be removed.

 Remove Avro RPC
 ---

 Key: HADOOP-7920
 URL: https://issues.apache.org/jira/browse/HADOOP-7920
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc
Affects Versions: 0.23.1
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
 Fix For: 0.24.0, 0.23.1

 Attachments: HADOOP-2970.patch, HADOOP-7920.patch, HADOOP-7920.patch, 
 HADOOP-7920.txt


 Please see the discussion in HDFS-2660 for more details. I have created a 
 branch, HADOOP-6659, to save the Avro work, in case someone in the future 
 wants to use the existing work to add support for Avro RPC.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7934) Normalize dependencies versions across all modules

2012-01-05 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13180672#comment-13180672
 ] 

Tom White commented on HADOOP-7934:
---

+1 Thanks for running the tests.

 Normalize dependencies versions across all modules
 --

 Key: HADOOP-7934
 URL: https://issues.apache.org/jira/browse/HADOOP-7934
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.24.0, 0.23.1
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
 Fix For: 0.24.0, 0.23.1

 Attachments: HADOOP-7934.patch, HADOOP-7934.patch


 Move all dependencies versions to the dependencyManagement section in the 
 hadoop-project POM
 Move all plugin versions to the dependencyManagement section in the 
 hadoop-project POM

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7937) Forward port SequenceFile#syncFs and friends from Hadoop 1.x

2012-01-05 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13180962#comment-13180962
 ] 

Tom White commented on HADOOP-7937:
---

The javac warning is because sync() is deprecated.

bq. Given that they're just used via DFSClient

Should DFSClient in trunk/23 use these methods, given that the append 
implementation is different to the one in 20? If so then that will need fixing 
in an HDFS JIRA.

I'll go ahead and commit this.

 Forward port SequenceFile#syncFs and friends from Hadoop 1.x
 

 Key: HADOOP-7937
 URL: https://issues.apache.org/jira/browse/HADOOP-7937
 Project: Hadoop Common
  Issue Type: New Feature
  Components: io
Affects Versions: 0.22.0, 0.23.1
Reporter: Eli Collins
Assignee: Tom White
  Labels: bigtop
 Attachments: HADOOP-7937.patch


 HDFS-200 added a new public API, SequenceFile#syncFs, which we need to forward 
 port for compatibility. It looks like it might have introduced other APIs that 
 need forward porting as well (e.g. LocatedBlocks#setFileLength and 
 DataNode#getBlockInfo).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7934) Normalize dependencies versions across all modules

2012-01-03 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13178915#comment-13178915
 ] 

Tom White commented on HADOOP-7934:
---

+1 This looks good to me. Have you been able to run the unit tests successfully?

 Normalize dependencies versions across all modules
 --

 Key: HADOOP-7934
 URL: https://issues.apache.org/jira/browse/HADOOP-7934
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.24.0, 0.23.1
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
 Fix For: 0.24.0, 0.23.1

 Attachments: HADOOP-7934.patch


 Move all dependencies versions to the dependencyManagement section in the 
 hadoop-project POM
 Move all plugin versions to the dependencyManagement section in the 
 hadoop-project POM

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7939) Improve Hadoop subcomponent integration in Hadoop 0.23

2011-12-29 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177241#comment-13177241
 ] 

Tom White commented on HADOOP-7939:
---

bq. We already have almost *all* of the proposed variables, the trouble is that 
their naming and usage is very  inconsistent. That's what this proposal was 
aiming at correcting.

A patch that fixes the inconsistencies for 0.23 and trunk would be great to 
have, and is separate from introducing any new environment variables. Would you 
be able to contribute such a patch?

Also, ideally as a user I'd be able to set one HOME environment variable and 
have all the others set to reasonable defaults. I'm not sure how true that is 
in 0.23/trunk today.


 Improve Hadoop subcomponent integration in Hadoop 0.23
 --

 Key: HADOOP-7939
 URL: https://issues.apache.org/jira/browse/HADOOP-7939
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build, conf, documentation, scripts
Affects Versions: 0.23.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Fix For: 0.23.1


 h1. Introduction
 For the rest of this proposal it is assumed that the current set
 of Hadoop subcomponents is:
  * hadoop-common
  * hadoop-hdfs
  * hadoop-yarn
  * hadoop-mapreduce
 It must be noted that this is an open ended list, though. For example,
 implementations of additional frameworks on top of yarn (e.g. MPI) would
 also be considered a subcomponent.
 h1. Problem statement
 Currently there's an unfortunate coupling and hard-coding present at the
 level of launcher scripts, configuration scripts and Java implementation
 code that prevents us from treating all subcomponents of Hadoop independently
 of each other. In a lot of places it is assumed that bits and pieces
 from individual subcomponents *must* be located at predefined places
 and they can not be dynamically registered/discovered during the runtime.
 This prevents a truly flexible deployment of Hadoop 0.23. 
 h1. Proposal
 NOTE: this is NOT a proposal for redefining the layout from HADOOP-6255. 
 The goal here is to keep as much of that layout in place as possible,
 while permitting different deployment layouts.
 The aim of this proposal is to introduce the needed level of indirection and
 flexibility in order to accommodate the current assumed layout of Hadoop 
 tarball
 deployments and all the other styles of deployments as well. To this end the
 following set of environment variables needs to be uniformly used in all of
 the subcomponent's launcher scripts, configuration scripts and Java code
 (SC stands for a literal name of a subcomponent). These variables are
 expected to be defined by SC-env.sh scripts and sourcing those files is
 expected to have the desired effect of setting the environment up correctly.
   # HADOOP_SC_HOME
## root of the subtree in a filesystem where a subcomponent is expected to 
 be installed 
## default value: $0/..
   # HADOOP_SC_JARS 
## a subdirectory with all of the jar files comprising subcomponent's 
 implementation 
## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)
   # HADOOP_SC_EXT_JARS
## a subdirectory with all of the jar files needed for extended 
 functionality of the subcomponent (nonessential for correct work of the basic 
 functionality)
## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)/ext
   # HADOOP_SC_NATIVE_LIBS
## a subdirectory with all the native libraries that component requires
## default value: $(HADOOP_SC_HOME)/share/hadoop/$(SC)/native
   # HADOOP_SC_BIN
## a subdirectory with all of the launcher scripts specific to the client 
 side of the component
## default value: $(HADOOP_SC_HOME)/bin
   # HADOOP_SC_SBIN
## a subdirectory with all of the launcher scripts specific to the 
 server/system side of the component
## default value: $(HADOOP_SC_HOME)/sbin
   # HADOOP_SC_LIBEXEC
## a subdirectory with all of the launcher scripts that are internal to 
 the implementation and should *not* be invoked directly
## default value: $(HADOOP_SC_HOME)/libexec
   # HADOOP_SC_CONF
## a subdirectory containing configuration files for a subcomponent
## default value: $(HADOOP_SC_HOME)/conf
   # HADOOP_SC_DATA
## a subtree in the local filesystem for storing component's persistent 
 state
## default value: $(HADOOP_SC_HOME)/data
   # HADOOP_SC_LOG
## a subdirectory for subcomponents's log files to be stored
## default value: $(HADOOP_SC_HOME)/log
   # HADOOP_SC_RUN
## a subdirectory with runtime system specific information
## default value: $(HADOOP_SC_HOME)/run
   # HADOOP_SC_TMP
## a subdirectory with temporary files
## default value: $(HADOOP_SC_HOME)/tmp

--
This message is automatically generated by JIRA.

[jira] [Commented] (HADOOP-7732) hadoop java docs bad pointer to hdfs package

2011-12-28 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176718#comment-13176718
 ] 

Tom White commented on HADOOP-7732:
---

Hi Matt, there's some more background at MAPREDUCE-559, which removed the 
public facing HDFS javadoc and added the javadoc-dev ant target for 
developers. In trunk the HDFS classes are marked @Private, so they would not 
show up in javadoc anyway, since we use the 
ExcludePrivateAnnotationsStandardDoclet when generating it.

I wonder if the link just needs to be to the HDFS documentation? E.g. 
http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html.

 hadoop java docs bad pointer to hdfs package
 

 Key: HADOOP-7732
 URL: https://issues.apache.org/jira/browse/HADOOP-7732
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.20.204.0, 0.20.205.0
Reporter: Arpit Gupta
Assignee: Matt Foley
Priority: Minor
 Attachments: HADOOP-7732.patch


 the following link 
 http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/hdfs/package-summary.html
 leads to a 404

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7732) hadoop java docs bad pointer to hdfs package

2011-12-27 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176312#comment-13176312
 ] 

Tom White commented on HADOOP-7732:
---

Since HDFS is accessed by users through FileSystem (in Common), we haven't 
published javadoc for it in the past since it's not part of the public API. 
There is an option to generate HDFS javadoc that developers can use locally.

 hadoop java docs bad pointer to hdfs package
 

 Key: HADOOP-7732
 URL: https://issues.apache.org/jira/browse/HADOOP-7732
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.20.204.0, 0.20.205.0
Reporter: Arpit Gupta
Assignee: Matt Foley
Priority: Minor
 Attachments: HADOOP-7732.patch


 the following link 
 http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/hdfs/package-summary.html
 leads to a 404

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7912) test-patch should run eclipse:eclipse to verify that it does not break again

2011-12-12 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13167683#comment-13167683
 ] 

Tom White commented on HADOOP-7912:
---

Apart from a typo (checkEclpiseGeneration), this looks good. What testing did 
you do?

 test-patch should run eclipse:eclipse to verify that it does not break again
 

 Key: HADOOP-7912
 URL: https://issues.apache.org/jira/browse/HADOOP-7912
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0, 0.23.1
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: HADOOP-7912.txt


 Recently the eclipse:eclipse build was broken.  If we are going to document 
 this on the wiki and have many developers use it we should verify that it 
 always works.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7884) test-patch seems to fail when a patch goes across projects (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist.

2011-12-12 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13167676#comment-13167676
 ] 

Tom White commented on HADOOP-7884:
---

The test-patch builds run with the base directory set to 
hadoop-{common,hdfs,mapreduce}-project. We could change common so it is run 
from the top-level, thus allowing cross-project patches, but it would mean that 
i) unit tests for every project are run for every change, and ii) only patches 
relative to the top-level would apply. i) is probably a good thing (changes to 
common should result in HDFS and MapReduce tests being run). For ii), do folks 
generally produce patches from the top-level anyway?

 test-patch seems to fail when a patch goes across projects 
 (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist.
 

 Key: HADOOP-7884
 URL: https://issues.apache.org/jira/browse/HADOOP-7884
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0
Reporter: Alejandro Abdelnur
 Fix For: 0.24.0


 Take for example HDFS-2178, the patch applies cleanly, but test-patch fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7884) test-patch seems to fail when a patch goes across projects (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist.

2011-12-12 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13167773#comment-13167773
 ] 

Tom White commented on HADOOP-7884:
---

As things stand, test-patch will always do a build from the top-level to check 
that there are no breakages, but tests are run from 
hadoop-{common,hdfs,mapreduce}-project so only tests from the relevant project 
are run.

Re: nightly builds, there is one for each project that does a full test-run and 
build: https://builds.apache.org/job/Hadoop-Common-trunk/, 
https://builds.apache.org/job/Hadoop-Hdfs-trunk/, 
https://builds.apache.org/job/Hadoop-Mapreduce-trunk/. Perhaps we need a 
top-level one.

 test-patch seems to fail when a patch goes across projects 
 (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist.
 

 Key: HADOOP-7884
 URL: https://issues.apache.org/jira/browse/HADOOP-7884
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0
Reporter: Alejandro Abdelnur
 Fix For: 0.24.0


 Take for example HDFS-2178, the patch applies cleanly, but test-patch fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7899) Generate proto java files as part of the build

2011-12-09 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166500#comment-13166500
 ] 

Tom White commented on HADOOP-7899:
---

+1

 Generate proto java files as part of the build
 --

 Key: HADOOP-7899
 URL: https://issues.apache.org/jira/browse/HADOOP-7899
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.24.0, 0.23.1
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.24.0, 0.23.1

 Attachments: HADOOP-7899v1.patch, HADOOP-7899v1.sh


 Currently the generated java files are precompiled and checked into the 
 source.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7810) move hadoop archive to core from tools

2011-12-09 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166711#comment-13166711
 ] 

Tom White commented on HADOOP-7810:
---

+1 for the trunk patch.

 move hadoop archive to core from tools
 --

 Key: HADOOP-7810
 URL: https://issues.apache.org/jira/browse/HADOOP-7810
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.20.205.0, 0.24.0, 0.23.1, 1.0.0
Reporter: John George
Assignee: John George
Priority: Blocker
 Attachments: HADOOP-7810v1-trunk.patch, HADOOP-7810v1-trunk.sh, 
 hadoop-7810.branch-0.20-security.patch, 
 hadoop-7810.branch-0.20-security.patch, hadoop-7810.branch-0.20-security.patch


 The HadoopArchives classes are included in the 
 $HADOOP_HOME/hadoop_tools.jar, but this file is not found in `hadoop 
 classpath`.
 A Pig script using HCatalog's dynamic partitioning with HAR enabled will 
 therefore fail if a jar with HAR is not included in the pig call's '-cp' and 
 '-Dpig.additional.jars' arguments.
 I am not aware of any reason to not include hadoop-tools.jar in 'hadoop 
 classpath'. Will attach a patch soon.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7738) Document incompatible API changes between 0.20.20x and 0.23.0 release

2011-12-06 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163731#comment-13163731
 ] 

Tom White commented on HADOOP-7738:
---

Here are some notes on the differences I have found so far between the 0.20.x 
release series and 0.23.0 which break compatibility in some way.

* MAPREDUCE-954 changed some context classes to interfaces (e.g. JobContext, 
MapContext, TaskAttemptContext, TaskInputOutputContext). This change should not 
impact user code (since such code doesn't implement these interfaces) although 
it does mean that user code (including libraries like Pig) will need to be 
recompiled. See note (1) at 
http://wiki.eclipse.org/Evolving_Java-based_APIs_2#Evolving_API_packages
* MAPREDUCE-901 changes Counter from a class to an interface. Clients need to 
recompile.
* HADOOP-6201 changed FileSystem#listStatus to throw FileNotFoundException when 
the file is not found, rather than returning null. Clients need to review usage 
of this method and update their code to handle this case (see the sketch below). 
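
A small sketch of the listStatus change noted in the last item: code written against 0.20.x that checked for a null return needs to catch FileNotFoundException on 0.23 (FileSystem#listStatus is the real API; the wrapper method is illustrative).
{noformat}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListStatusExample {
  static FileStatus[] listOrEmpty(FileSystem fs, Path dir) throws IOException {
    try {
      return fs.listStatus(dir);        // 0.23: throws if dir does not exist
    } catch (FileNotFoundException e) {
      return new FileStatus[0];         // old 0.20.x code checked for null here
    }
  }
}
{noformat}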


 Document incompatible API changes between 0.20.20x and 0.23.0 release
 -

 Key: HADOOP-7738
 URL: https://issues.apache.org/jira/browse/HADOOP-7738
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
 Fix For: 0.23.1

 Attachments: apicheck-hadoop-0.20.204.0-0.24.0-SNAPSHOT.txt


 0.20.20x to 0.23.0 will be a common upgrade path, so we should document any 
 incompatible API changes that will affect users.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7874) native libs should be under lib/native/ dir

2011-12-01 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161071#comment-13161071
 ] 

Tom White commented on HADOOP-7874:
---

Sounds reasonable. What testing did you do on the current patch?

 native libs should be under lib/native/ dir
 ---

 Key: HADOOP-7874
 URL: https://issues.apache.org/jira/browse/HADOOP-7874
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0, 0.23.1
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
  Labels: bigtop
 Attachments: HADOOP-7874.patch


 Currently common and hdfs SO files end up under lib/ dir with all JARs, they 
 should end up under lib/native.
 In addition, the hadoop-config.sh script needs some cleanup when it comes to 
 native lib handling:
 * it is using lib/native/${JAVA_PLATFORM} for the java.library.path, when it 
 should use lib/native.
 * it is looking for build/lib/native; this is from the old ant build and is not 
 applicable anymore.
 * it is looking for libhdfs.a and adding it to the java.library.path, which is 
 not correct.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7874) native libs should be under lib/native/ dir

2011-12-01 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161260#comment-13161260
 ] 

Tom White commented on HADOOP-7874:
---

+1

 native libs should be under lib/native/ dir
 ---

 Key: HADOOP-7874
 URL: https://issues.apache.org/jira/browse/HADOOP-7874
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0, 0.23.1
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
  Labels: bigtop
 Attachments: HADOOP-7874.patch


 Currently common and hdfs SO files end up under lib/ dir with all JARs, they 
 should end up under lib/native.
 In addition, the hadoop-config.sh script needs some cleanup when it comes to 
 native lib handling:
 * it is using lib/native/${JAVA_PLATFORM} for the java.library.path, when it 
 should use lib/native.
 * it is looking for build/lib/native; this is from the old ant build and is not 
 applicable anymore.
 * it is looking for libhdfs.a and adding it to the java.library.path, which is 
 not correct.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7874) native libs should be under lib/native/ dir

2011-11-30 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13160509#comment-13160509
 ] 

Tom White commented on HADOOP-7874:
---

bq. it is using lib/native/${JAVA_PLATFORM} for the java.library.path, when it 
 should use lib/native

Why has the arch directory been dropped?

 native libs should be under lib/native/ dir
 ---

 Key: HADOOP-7874
 URL: https://issues.apache.org/jira/browse/HADOOP-7874
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0, 0.23.1
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
  Labels: bigtop
 Attachments: HADOOP-7874.patch


 Currently common and hdfs SO files end up under lib/ dir with all JARs, they 
 should end up under lib/native.
 In addition, the hadoop-config.sh script needs some cleanup when it comes to 
 native lib handling:
 * it is using lib/native/${JAVA_PLATFORM} for the java.library.path, when it 
 should use lib/native.
 * it is looking for build/lib/native, this is from the old ant build, not 
 applicable anymore.
 * it is looking for libhdfs.a and adding it to the java.library.path; this is 
 not correct.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples

2011-11-22 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13155295#comment-13155295
 ] 

Tom White commented on HADOOP-7590:
---

Arun, I don't feel strongly where streaming lives. 

Alejandro, was there a reason streaming can't go in hadoop-mapreduce-project?

 Mavenize streaming and MR examples
 --

 Key: HADOOP-7590
 URL: https://issues.apache.org/jira/browse/HADOOP-7590
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.24.0, 0.23.1

 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, 
 HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, 
 HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, 
 HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh, 
 HADOOP-7590v8.patch, HADOOP-7590v8.sh


 MR1 code is still available in MR2 for testing contribs.
 This is temporary until the contrib tests are ported to MR2.
 As a follow-up, the contrib projects themselves should be mavenized.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples

2011-11-18 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152972#comment-13152972
 ] 

Tom White commented on HADOOP-7590:
---

I don't disagree with this assessment, but can't the work to make streaming 
work from the command line be done in this JIRA?

Regarding the rant: is there a JIRA to fix this?

 Mavenize streaming and MR examples
 --

 Key: HADOOP-7590
 URL: https://issues.apache.org/jira/browse/HADOOP-7590
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.1

 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, 
 HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, 
 HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, 
 HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh


 MR1 code is still available in MR2 for testing contribs.
 This is temporary until the contrib tests are ported to MR2.
 As a follow-up, the contrib projects themselves should be mavenized.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples

2011-11-18 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13153255#comment-13153255
 ] 

Tom White commented on HADOOP-7590:
---

Looks good to me, +1.

I think it's OK to fix the assembly part in a follow-up as long as it gets into 
0.23.1.

 Mavenize streaming and MR examples
 --

 Key: HADOOP-7590
 URL: https://issues.apache.org/jira/browse/HADOOP-7590
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.1

 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, 
 HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, 
 HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, 
 HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh, 
 HADOOP-7590v8.patch, HADOOP-7590v8.sh


 MR1 code is still available in MR2 for testing contribs.
 This is temporary until the contrib tests are ported to MR2.
 As a follow-up, the contrib projects themselves should be mavenized.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples

2011-11-17 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152387#comment-13152387
 ] 

Tom White commented on HADOOP-7590:
---

I think the tests are relevant as they act as regression tests.

* TestStreamingBadRecords, TestStreamingStatus and MiniMRClientClusterFactory 
 are failing for me with an NPE caused by the new MiniMRClientClusterFactory. Can 
you check this please?
* TestStreamingCombiner is failing because the counter can't be found. This 
could be a problem in MR on YARN, so it would be OK to look at this separately.

 Mavenize streaming and MR examples
 --

 Key: HADOOP-7590
 URL: https://issues.apache.org/jira/browse/HADOOP-7590
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.1

 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, 
 HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, 
 HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, 
 HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh


 MR1 code is still available in MR2 for testing contribs.
 This is temporary until the contrib tests are ported to MR2.
 As a follow-up, the contrib projects themselves should be mavenized.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7590) Mavenize streaming and MR examples

2011-11-17 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152660#comment-13152660
 ] 

Tom White commented on HADOOP-7590:
---

 How about opening a JIRA for these failing tests and marking it as a blocker?

Sounds reasonable to me.

 Still missing (though it should be another JIRA) is how the streaming JAR gets 
 into the classpath of the shell command.

Do you mean that with this patch you can't run streaming jobs from the command 
line? Currently it's possible (albeit tricky since it requires mvn and ant) to 
build a distribution that supports streaming jobs, so we should continue to 
support that in this patch.
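
For illustration, a rough sketch of the kind of command-line invocation that should 
keep working once the streaming JAR is reachable; the JAR path and the cat/wc 
mapper and reducer are placeholders, not part of the patch:

{noformat}
# Run a trivial streaming job from the shell; the streaming JAR location is a
# placeholder for wherever the distribution ends up putting it.
hadoop jar "$HADOOP_MAPRED_HOME"/hadoop-streaming-*.jar \
  -input in -output out \
  -mapper /bin/cat -reducer /usr/bin/wc
{noformat}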

 Mavenize streaming and MR examples
 --

 Key: HADOOP-7590
 URL: https://issues.apache.org/jira/browse/HADOOP-7590
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.1

 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, 
 HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, 
 HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh, 
 HADOOP-7590v6.patch, HADOOP-7590v6.sh, HADOOP-7590v7.patch, HADOOP-7590v7.sh


 MR1 code is still available in MR2 for testing contribs.
 This is temporary until the contrib tests are ported to MR2.
 As a follow-up, the contrib projects themselves should be mavenized.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7809) Backport HADOOP-5839 to 0.20-security - fixes to ec2 scripts to allow remote job submission

2011-11-08 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146779#comment-13146779
 ] 

Tom White commented on HADOOP-7809:
---

Joydeep, have you looked at Whirr (http://whirr.apache.org/)? The EC2 scripts 
in Hadoop were deprecated in favour of Whirr well over a year ago.

 Backport HADOOP-5839 to 0.20-security - fixes to ec2 scripts to allow remote 
 job submission
 ---

 Key: HADOOP-7809
 URL: https://issues.apache.org/jira/browse/HADOOP-7809
 Project: Hadoop Common
  Issue Type: Improvement
  Components: contrib/cloud
Reporter: Joydeep Sen Sarma
Assignee: Matt Foley
 Attachments: hadoop-5839.2.patch


 The fix for HADOOP-5839 was committed to 0.21 more than a year ago.  This bug 
 is to backport the change (which is only 14 lines) to branch-0.20-security.
 ===
 Original description:
 I would very much like the option of submitting jobs from a workstation 
 outside EC2 to a Hadoop cluster in EC2. This has been explored here:
 http://www.nabble.com/public-IP-for-datanode-on-EC2-tt19336240.html
 The net result is that we can make this work (along with using a 
 SOCKS proxy) with a couple of changes in the EC2 scripts:
 a) use the public 'hostname' for the fs.default.name setting (instead of the private 
 hostname being used currently)
 b) mark hadoop.rpc.socket.factory.class.default as a final variable in the 
 generated hadoop-site.xml (that applies to the server side)
 Item (a) has no downside as far as I can tell, since public hostnames resolve to 
 internal/private IP addresses within EC2 (so traffic is optimally routed).
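
 For illustration, a rough sketch of the two generated hadoop-site.xml entries the 
 description asks for, written in the heredoc style used elsewhere in this thread; 
 the hostname, port, and config path are placeholders, not values from the backported patch:

{noformat}
# Append the public-hostname fs.default.name and the final socket-factory
# setting to the generated configuration (paths and hostname are placeholders).
cat >> "$HADOOP_CONF_DIR"/hadoop-site.xml << EOF
<property>
  <name>fs.default.name</name>
  <value>hdfs://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:8020/</value>
</property>
<property>
  <name>hadoop.rpc.socket.factory.class.default</name>
  <value>org.apache.hadoop.net.StandardSocketFactory</value>
  <final>true</final>
</property>
EOF
{noformat}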

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7801) HADOOP_PREFIX cannot be overriden

2011-11-07 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145737#comment-13145737
 ] 

Tom White commented on HADOOP-7801:
---

+1 This is a reasonable change.

Nit: the change to fuse_dfs_wrapper.sh is not needed since HADOOP_PREFIX is 
only assigned if it is empty or unset.
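
For reference, a minimal sketch of the "assign only if empty or unset" idiom referred 
to above; this is not a quote of the patch, and $this is assumed to be the script-path 
variable already used by hadoop-config.sh:

{noformat}
# Keep any HADOOP_PREFIX the caller has already exported; otherwise fall back
# to the directory above the script, as the current hadoop-config.sh does.
export HADOOP_PREFIX="${HADOOP_PREFIX:-`dirname "$this"`/..}"
{noformat}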

 HADOOP_PREFIX cannot be overriden
 -

 Key: HADOOP-7801
 URL: https://issues.apache.org/jira/browse/HADOOP-7801
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0
Reporter: Bruno Mahé
 Attachments: HADOOP-7801-2.patch, HADOOP-7801.patch


 hadoop-config.sh forces HADOOP_PREFIX to a specific value:
 export HADOOP_PREFIX=`dirname $this`/..
 It would be nice to make this overridable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7802) hadoop script unconditionnaly source $bin/../libexec/hadoop-config.sh

2011-11-07 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145739#comment-13145739
 ] 

Tom White commented on HADOOP-7802:
---

+1

 hadoop script unconditionnaly source $bin/../libexec/hadoop-config.sh
 ---

 Key: HADOOP-7802
 URL: https://issues.apache.org/jira/browse/HADOOP-7802
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Bruno Mahé
 Attachments: HADOOP-7802-2.patch, HADOOP-7802.patch


 It would be nice to be able to specify some other location for 
 hadoop-config.sh
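
 For illustration, a minimal sketch of how the hadoop script could source a 
 configurable location instead of the hard-coded one; the HADOOP_LIBEXEC_DIR name is 
 an assumption used here for illustration, not necessarily what the patch introduces:

{noformat}
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
# Let the caller point at an alternative libexec directory; fall back to the
# bundled one otherwise.
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-"$bin"/../libexec}
. "$HADOOP_LIBEXEC_DIR"/hadoop-config.sh
{noformat}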

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7787) binary and source tarball names are not consistent

2011-11-07 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145747#comment-13145747
 ] 

Tom White commented on HADOOP-7787:
---

I would suggest hadoop-${project.version}.tar.gz for the binary and 
hadoop-${project.version}-src.tar.gz for the source. We haven't used a -bin 
suffix in Hadoop before, and this is consistent with the 0.23.0 release 
candidate Arun created last week 
(http://people.apache.org/~acmurthy/hadoop-0.23.0-rc2/).

 binary and source tarball names are not consistent
 --

 Key: HADOOP-7787
 URL: https://issues.apache.org/jira/browse/HADOOP-7787
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0
Reporter: Bruno Mahé
 Attachments: HADOOP-7787.patch


 When building binary and source tarballs, I get the following artifacts:
 Binary tarball: hadoop-0.23.0-SNAPSHOT.tar.gz 
 Source tarball: hadoop-dist-0.23.0-SNAPSHOT-src.tar.gz
 Notice the -dist right between hadoop and the version in the source 
 tarball name.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7590) Mavenize the MR1 JARs (main and test) creation

2011-11-01 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141305#comment-13141305
 ] 

Tom White commented on HADOOP-7590:
---

Alejandro, the problem you are seeing won't be a problem with MAPREDUCE-3169 
(since the tests will run under YARN, not MR1), so perhaps you could make that 
issue a dependency?

 Mavenize the MR1 JARs (main and test) creation
 --

 Key: HADOOP-7590
 URL: https://issues.apache.org/jira/browse/HADOOP-7590
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.24.0

 Attachments: HADOOP-7590v1.patch, HADOOP-7590v1.sh, 
 HADOOP-7590v2.patch, HADOOP-7590v2.sh, HADOOP-7590v3.patch, HADOOP-7590v3.sh, 
 HADOOP-7590v4.patch, HADOOP-7590v4.sh, HADOOP-7590v5.patch, HADOOP-7590v5.sh


 MR1 code is still available in MR2 for testing contribs.
 This is temporary until the contrib tests are ported to MR2.
 As a follow-up, the contrib projects themselves should be mavenized.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7785) Add equals, hashcode, toString to DataChecksum

2011-10-31 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13140941#comment-13140941
 ] 

Tom White commented on HADOOP-7785:
---

+1

 Add equals, hashcode, toString to DataChecksum
 --

 Key: HADOOP-7785
 URL: https://issues.apache.org/jira/browse/HADOOP-7785
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io, util
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hadoop-7785.txt


 Simple patch to add these functions to the DataChecksum interface. This is 
 handy for the sake of HDFS-2130.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7782) Add a way to merge javadocs?

2011-10-29 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139442#comment-13139442
 ] 

Tom White commented on HADOOP-7782:
---

The short answer is mvn javadoc:aggregate -Dmaxmemory=1024. However, we 
should only publish the public API, which means we should use the 
annotation-aware doclet, and we should group the API by project using the 
groups parameter. 

More at 
http://maven.apache.org/plugins/maven-javadoc-plugin/examples/aggregate.html 
and http://maven.apache.org/plugins/maven-javadoc-plugin/aggregate-mojo.html

I can look at this one if you like.
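
For reference, a sketch of the aggregate invocation, assuming it is run from the root 
of the source tree; the annotation-aware doclet and the per-project groups would be 
configured in the root pom rather than on the command line:

{noformat}
# Build a single merged javadoc set across the modules.
mvn javadoc:aggregate -Dmaxmemory=1024
# The merged javadoc typically ends up under target/site/apidocs at the root.
{noformat}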


 Add a way to merge javadocs?
 

 Key: HADOOP-7782
 URL: https://issues.apache.org/jira/browse/HADOOP-7782
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Arun C Murthy
Priority: Critical

 With 'mvn javadoc:javadoc' we now get docs spread over the maven modules. Is 
 there a way to stitch them all together?
 Also, there are some differences in their generation: hadoop-auth and 
 hadoop-yarn-* and hadoop-mapreduce-* modules go to a top-level apidocs dir which 
 isn't the case for hadoop-common and hadoop-hdfs - they go straight to 
 target/site/api.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7741) Maven related JIRAs to backport to 0.23

2011-10-26 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13136056#comment-13136056
 ] 

Tom White commented on HADOOP-7741:
---

HADOOP-7763

 Maven related JIRAs to backport to 0.23
 ---

 Key: HADOOP-7741
 URL: https://issues.apache.org/jira/browse/HADOOP-7741
 Project: Hadoop Common
  Issue Type: Task
  Components: build
Affects Versions: 0.23.0
Reporter: Alejandro Abdelnur
 Fix For: 0.23.0


 HADOOP-7624
 HDFS-2294
 MAPREDUCE-3014
 HDFS-2322
 HADOOP-7642
 MAPREDUCE-3171
 HADOOP-7737
 MAPREDUCE-3177
 MAPREDUCE-3003
 HADOOP-7590
 MAPREDUCE-3024
 HADOOP-7538

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7741) Maven related JIRAs to backport to 0.23

2011-10-26 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13136618#comment-13136618
 ] 

Tom White commented on HADOOP-7741:
---

HADOOP-7768

 Maven related JIRAs to backport to 0.23
 ---

 Key: HADOOP-7741
 URL: https://issues.apache.org/jira/browse/HADOOP-7741
 Project: Hadoop Common
  Issue Type: Task
  Components: build
Affects Versions: 0.23.0
Reporter: Alejandro Abdelnur
 Fix For: 0.23.0


 HADOOP-7624
 HDFS-2294
 MAPREDUCE-3014
 HDFS-2322
 HADOOP-7642
 MAPREDUCE-3171
 HADOOP-7737
 MAPREDUCE-3177
 MAPREDUCE-3003
 HADOOP-7590
 MAPREDUCE-3024
 HADOOP-7538

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7758) Make GlobFilter class public

2011-10-19 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131134#comment-13131134
 ] 

Tom White commented on HADOOP-7758:
---

+1 this looks better. Add some class-level javadoc too?

 Make GlobFilter class public
 

 Key: HADOOP-7758
 URL: https://issues.apache.org/jira/browse/HADOOP-7758
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.0, 0.24.0

 Attachments: HDFS-2474.patch, HDFS-2474.patch


 Currently the GlobFilter class is package private.
 As a generic filter it is quite useful (and I've found myself cutting and 
 pasting it a few times).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7755) Detect MapReduce PreCommit Trunk builds silently failing when running test-patch.sh

2011-10-18 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130017#comment-13130017
 ] 

Tom White commented on HADOOP-7755:
---

+1

 Detect MapReduce PreCommit Trunk builds silently failing when running 
 test-patch.sh
 ---

 Key: HADOOP-7755
 URL: https://issues.apache.org/jira/browse/HADOOP-7755
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
Priority: Blocker
 Attachments: HADOOP-7755.patch


 MapReduce PreCommit builds are silently failing, running only a very small 
 portion of the tests. The build then errors out, yet a +1 is given to the patch.
 Last known Success build - 307 tests run and passed
 https://builds.apache.org/view/G-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/990/testReport/
 First known Error build - 69 tests run and passed
 https://builds.apache.org/view/G-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/994/testReport/
 Snippet from the failed build log - it errors out and then +1s the patch
 https://builds.apache.org/view/G-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/994/console
 [INFO] 
 
 [INFO] Reactor Summary:
 [INFO] 
 [INFO] hadoop-yarn-api ... SUCCESS [19.512s]
 [INFO] hadoop-yarn-common  FAILURE [13.835s]
 [INFO] hadoop-yarn-server-common . SKIPPED
 [INFO] hadoop-yarn-server-nodemanager  SKIPPED
 [INFO] hadoop-yarn-server-resourcemanager  SKIPPED
 [INFO] hadoop-yarn-server-tests .. SKIPPED
 [INFO] hadoop-yarn-server  SKIPPED
 [INFO] hadoop-yarn-applications-distributedshell . SKIPPED
 [INFO] hadoop-yarn-applications .. SKIPPED
 [INFO] hadoop-yarn-site .. SKIPPED
 [INFO] hadoop-yarn ... SKIPPED
 [INFO] hadoop-mapreduce-client-core .. SKIPPED
 [INFO] hadoop-mapreduce-client-common  SKIPPED
 [INFO] hadoop-mapreduce-client-shuffle ... SKIPPED
 [INFO] hadoop-mapreduce-client-app ... SKIPPED
 [INFO] hadoop-mapreduce-client-hs  SKIPPED
 [INFO] hadoop-mapreduce-client-jobclient . SKIPPED
 [INFO] hadoop-mapreduce-client ... SKIPPED
 [INFO] hadoop-mapreduce .. SKIPPED
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 33.784s
 [INFO] Finished at: Wed Oct 12 12:03:22 UTC 2011
 [INFO] Final Memory: 40M/630M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-assembly-plugin:2.2-beta-5:single (tar) on 
 project hadoop-yarn-common: Failed to create assembly: Error adding file 
 'org.apache.hadoop:hadoop-yarn-api:jar:0.24.0-SNAPSHOT' to archive: 
 /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api/target/classes
  isn't a file. - [Help 1]
 [ERROR] 
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR] 
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 [ERROR] 
 [ERROR] After correcting the problems, you can resume the build with the 
 command
 [ERROR]   mvn goals -rf :hadoop-yarn-common
 ==
 ==
 Running contrib tests.
 ==
 ==
 /bin/kill -9 27543 
 kill: No such process
 NOP
 ==
 ==
 Checking the integrity of system test framework code.
 ==
 ==
 /bin/kill -9 27548 
 kill: No such process
 NOP
 +1 overall.  Here are the results of testing the latest attachment 
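
 For illustration, a rough sketch of the kind of exit-status guard that prevents this 
 class of silent failure; the commands and messages are placeholders, not taken from 
 test-patch.sh or the attached patch:

{noformat}
# Capture the Maven exit status and stop the run instead of carrying on to a
# +1 when the build itself has failed.
mvn clean test > build.log 2>&1
status=$?
if [ $status -ne 0 ]; then
  echo "Build failed (exit $status); aborting precommit run"
  exit $status
fi
{noformat}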
 

[jira] [Commented] (HADOOP-7737) normalize hadoop-mapreduce & hadoop-dist dist/tar build with common/hdfs

2011-10-13 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126880#comment-13126880
 ] 

Tom White commented on HADOOP-7737:
---

+1. I tried with and without -Dtar, and managed to run an MR job from the 
exploded directory.

{noformat}
export HADOOP_COMMON_HOME=$(pwd)/$(ls -d 
hadoop-common-project/hadoop-common/target/hadoop-common-*-SNAPSHOT)
export HADOOP_HDFS_HOME=$(pwd)/$(ls -d 
hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-*-SNAPSHOT)
export HADOOP_MAPRED_HOME=$(pwd)/$(ls -d 
hadoop-mapreduce-project/target/hadoop-mapreduce-*-SNAPSHOT)
export YARN_HOME=$HADOOP_MAPRED_HOME

export 
PATH=$HADOOP_COMMON_HOME/bin:$HADOOP_HDFS_HOME/bin:$HADOOP_MAPRED_HOME/bin:$PATH

cat > $YARN_HOME/conf/yarn-site.xml << EOF
<?xml version="1.0"?>
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce.shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
EOF

cd hadoop-mapreduce-project
ant examples -Dresolvers=internal
cd ..
export HADOOP_CLASSPATH=$YARN_HOME/modules/*
mkdir in
cp BUILDING.txt in/
hadoop jar 
hadoop-mapreduce-project/build/hadoop-mapreduce-examples-0.24.0-SNAPSHOT.jar 
wordcount -Dmapreduce.job.user.name=$USER in out
{noformat}

I'll add this to 
http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment after it's 
committed.

 normalize hadoop-mapreduce & hadoop-dist dist/tar build with common/hdfs
 

 Key: HADOOP-7737
 URL: https://issues.apache.org/jira/browse/HADOOP-7737
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.0, 0.24.0

 Attachments: HADOOP-7737.patch, HADOOP-7737.patch


 Normalize the build of hadoop-mapreduce and hadoop-dist with hadoop-common 
 and hadoop-hdfs, making the -Pdist and -Dtar Maven options consistent.
 * -Pdist should create the layout
 * -Dtar should create the TAR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7738) Document incompatible API changes between 0.20.20x and 0.23.0 release

2011-10-13 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126984#comment-13126984
 ] 

Tom White commented on HADOOP-7738:
---

I would like to eliminate the false positives (e.g. by excluding them from 
SigTest), and (time permitting) go through the remaining ones so they can 
either be fixed if possible, or documented in release notes.

 Document incompatible API changes between 0.20.20x and 0.23.0 release
 -

 Key: HADOOP-7738
 URL: https://issues.apache.org/jira/browse/HADOOP-7738
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
 Fix For: 0.23.0

 Attachments: apicheck-hadoop-0.20.204.0-0.24.0-SNAPSHOT.txt


 0.20.20x to 0.23.0 will be a common upgrade path, so we should document any 
 incompatible API changes that will affect users.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7743) Add Maven profile to create a full source tarball

2011-10-13 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127014#comment-13127014
 ] 

Tom White commented on HADOOP-7743:
---

+1

 Add Maven profile to create a full source tarball
 -

 Key: HADOOP-7743
 URL: https://issues.apache.org/jira/browse/HADOOP-7743
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.0, 0.24.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.23.0, 0.24.0

 Attachments: HADOOP-7743.patch, HADOOP-7743.patch


 Currently we are building binary distributions only.
 We should also build a full source distribution from which Hadoop can be 
 built.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7035) Document incompatible API changes between releases

2011-09-29 Thread Tom White (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117776#comment-13117776
 ] 

Tom White commented on HADOOP-7035:
---

 1. Is it possible that the tool needs to be applied the other way around, 
 that is, having 0.22 as the base and the tested version being 0.21?

The method signature changed, which is reported as the method being removed. In 
0.21 it was
{code}
protected SequenceFile.Reader createDataFileReader(FileSystem fs,
Path dataFile, Configuration conf)
{code}

And in 0.22 it is

{code}
protected SequenceFile.Reader createDataFileReader(Path dataFile,
    Configuration conf, SequenceFile.Reader.Option... options)
{code}

 2. Did you run the tool against MR only? Hard to believe there were no API 
 changes in HDFS and common.

I ran it against all three. HDFS is marked as @Private, so it won't show up in 
the report.

 3. What is the final goal of this JIRA? Is it to identify incompatible 
 changes and make a patch for site with the release notes?

Yes, including it in the release notes would be a good start.

 If so, we can filter out non-public changes from the reports generated by 
 SigTest, and probably those that do not belong to public APIs in terms of 
 Hadoop annotations, if it makes sense.

The script already uses the annotations to restrict the changes to the public 
API.

 Document incompatible API changes between releases
 --

 Key: HADOOP-7035
 URL: https://issues.apache.org/jira/browse/HADOOP-7035
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
 Fix For: 0.22.0

 Attachments: apicheck-hadoop-0.20.203.0-0.20.204.0.txt, 
 apicheck-hadoop-0.21.0-0.22.0-SNAPSHOT.txt, jdiff-with-previous-release.sh, 
 jdiff-with-previous-release.sh


 We can use JDiff to generate a list of incompatible changes for each release. 
 See 
 https://issues.apache.org/jira/browse/HADOOP-6668?focusedCommentId=12860017page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12860017

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira