[jira] Updated: (HADOOP-7183) WritableComparator.get should not cache comparator objects

2011-03-11 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7183:
--

Attachment: HADOOP-7183.patch

The problem is that WritableComparator has a mutable field - DataInputBuffer 
buffer - which is only used by WritableComparable implementations that *don't* 
override the optimized binary compare method. IntWritable, Text, etc override 
this method, so there is no thread safety issue for these.

The remedy is to only register comparators explicitly, i.e. not the generic 
ones, since they may not be thread-safe. This is actually the behaviour that 
was in place before HADOOP-6881.

I've also updated the javadoc for WritableComparator.define to clarify that it 
should only be called for thread-safe classes.
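
For reference, a minimal sketch of the explicit-registration pattern (the IntPairWritable class below is hypothetical, not part of the patch): the key type ships a stateless raw comparator and registers it with WritableComparator.define, so WritableComparator.get never needs the buffer-based generic comparator for it.

{code:java}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

// Hypothetical key type, written in the same style as IntWritable/Text.
public class IntPairWritable implements WritableComparable<IntPairWritable> {
  private int first;
  private int second;

  public void set(int first, int second) {
    this.first = first;
    this.second = second;
  }

  public void write(DataOutput out) throws IOException {
    out.writeInt(first);
    out.writeInt(second);
  }

  public void readFields(DataInput in) throws IOException {
    first = in.readInt();
    second = in.readInt();
  }

  public int compareTo(IntPairWritable o) {
    if (first != o.first) {
      return first < o.first ? -1 : 1;
    }
    return second < o.second ? -1 : (second == o.second ? 0 : 1);
  }

  // Stateless raw comparator: it compares serialized bytes directly and holds
  // no per-instance buffer, so a single shared instance is thread-safe.
  public static class Comparator extends WritableComparator {
    public Comparator() {
      super(IntPairWritable.class);
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
      int f1 = readInt(b1, s1);
      int f2 = readInt(b2, s2);
      if (f1 != f2) {
        return f1 < f2 ? -1 : 1;
      }
      int t1 = readInt(b1, s1 + 4);
      int t2 = readInt(b2, s2 + 4);
      return t1 < t2 ? -1 : (t1 == t2 ? 0 : 1);
    }
  }

  static {
    // Only comparators that are known to be thread-safe should be registered.
    WritableComparator.define(IntPairWritable.class, new Comparator());
  }
}
{code}

IntWritable, Text and friends follow exactly this pattern, which is why they are unaffected by the shared buffer.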


 WritableComparator.get should not cache comparator objects
 --

 Key: HADOOP-7183
 URL: https://issues.apache.org/jira/browse/HADOOP-7183
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Priority: Blocker
 Fix For: 0.20.3, 0.21.1, 0.22.0

 Attachments: HADOOP-7183.patch


 HADOOP-6881 modified WritableComparator.get such that the constructed 
 WritableComparator gets saved back into the static map. This is fine for 
 stateless comparators, but some comparators have per-instance state, and thus 
 this becomes thread-unsafe and causes errors in the shuffle where multiple 
 threads are doing comparisons. An example of a Comparator with per-instance 
 state is WritableComparator itself.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Assigned: (HADOOP-7183) WritableComparator.get should not cache comparator objects

2011-03-11 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White reassigned HADOOP-7183:
-

Assignee: Tom White

 WritableComparator.get should not cache comparator objects
 --

 Key: HADOOP-7183
 URL: https://issues.apache.org/jira/browse/HADOOP-7183
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Tom White
Priority: Blocker
 Fix For: 0.20.3, 0.21.1, 0.22.0

 Attachments: HADOOP-7183.patch


 HADOOP-6881 modified WritableComparator.get such that the constructed 
 WritableComparator gets saved back into the static map. This is fine for 
 stateless comparators, but some comparators have per-instance state, and thus 
 this becomes thread-unsafe and causes errors in the shuffle where multiple 
 threads are doing comparisons. An example of a Comparator with per-instance 
 state is WritableComparator itself.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Updated: (HADOOP-7183) WritableComparator.get should not cache comparator objects

2011-03-11 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7183:
--

Status: Patch Available  (was: Open)

 WritableComparator.get should not cache comparator objects
 --

 Key: HADOOP-7183
 URL: https://issues.apache.org/jira/browse/HADOOP-7183
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Tom White
Priority: Blocker
 Fix For: 0.20.3, 0.21.1, 0.22.0

 Attachments: HADOOP-7183.patch


 HADOOP-6881 modified WritableComparator.get such that the constructed 
 WritableComparator gets saved back into the static map. This is fine for 
 stateless comparators, but some comparators have per-instance state, and thus 
 this becomes thread-unsafe and causes errors in the shuffle where multiple 
 threads are doing comparisons. An example of a Comparator with per-instance 
 state is WritableComparator itself.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-7154) Should set MALLOC_ARENA_MAX in hadoop-env.sh

2011-03-08 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004231#comment-13004231
 ] 

Tom White commented on HADOOP-7154:
---

+1

 Should set MALLOC_ARENA_MAX in hadoop-env.sh
 

 Key: HADOOP-7154
 URL: https://issues.apache.org/jira/browse/HADOOP-7154
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.22.0

 Attachments: hadoop-7154.txt


 New versions of glibc present in RHEL6 include a new arena allocator design. 
 In several clusters we've seen this new allocator cause huge amounts of 
 virtual memory to be used, since when multiple threads perform allocations, 
 they each get their own memory arena. On a 64-bit system, these arenas are 
 64M mappings, and the maximum number of arenas is 8 times the number of 
 cores. We've observed a DN process using 14GB of vmem for only 300M of 
 resident set. This causes all kinds of nasty issues for obvious reasons.
 Setting MALLOC_ARENA_MAX to a low number will restrict the number of memory 
 arenas and bound the virtual memory, with no noticeable downside in 
 performance - we've been recommending MALLOC_ARENA_MAX=4. We should set this 
 in hadoop-env.sh to avoid this issue as RHEL6 becomes more and more common.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-7166) DaemonFactory should be moved from HDFS to common

2011-03-07 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003656#comment-13003656
 ] 

Tom White commented on HADOOP-7166:
---

Is the motivation for this for MapReduce to use DaemonFactory?

 DaemonFactory should be moved from HDFS to common
 -

 Key: HADOOP-7166
 URL: https://issues.apache.org/jira/browse/HADOOP-7166
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HADOOP-7166.1.patch


 DaemonFactory class is defined in hdfs util. common would be a better place 
 for this class.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-7116) raise contrib junit test jvm memory size to 512mb

2011-03-03 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002308#comment-13002308
 ] 

Tom White commented on HADOOP-7116:
---

Looks like this was committed to the 0.20 branch. Can it be closed?

 raise contrib junit test jvm memory size to 512mb
 -

 Key: HADOOP-7116
 URL: https://issues.apache.org/jira/browse/HADOOP-7116
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 0.20.2
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.20.3

 Attachments: h-7116.patch


 The streaming tests are failing with out of memory. Raise the memory limit to 
 512mb.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-6370) Contrib project ivy dependencies are not included in binary target

2011-03-03 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6370:
--

Status: Open  (was: Patch Available)

Unfortunately this has fallen out of date. Aaron, would you like to regenerate 
it? Thanks.

 Contrib project ivy dependencies are not included in binary target
 --

 Key: HADOOP-6370
 URL: https://issues.apache.org/jira/browse/HADOOP-6370
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Aaron Kimball
Assignee: Aaron Kimball
Priority: Critical
 Attachments: HADOOP-6370.2.patch, HADOOP-6370.patch


 Only Hadoop's own library dependencies are promoted to ${build.dir}/lib; any 
 libraries required by contribs are not redistributed.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-7139) Allow appending to existing SequenceFiles

2011-03-03 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7139:
--

Assignee: Mac Yang
  Status: Open  (was: Patch Available)

 Allow appending to existing SequenceFiles
 -

 Key: HADOOP-7139
 URL: https://issues.apache.org/jira/browse/HADOOP-7139
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 0.20.1
Reporter: Stephen Rose
Assignee: Mac Yang
Priority: Minor
 Fix For: 0.23.0

 Attachments: HADOOP-7139.patch, HADOOP-7139.patch, HADOOP-7139.patch, 
 HADOOP-7139.patch

   Original Estimate: 2h
  Remaining Estimate: 2h



-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Assigned: (HADOOP-7139) Allow appending to existing SequenceFiles

2011-03-03 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White reassigned HADOOP-7139:
-

Assignee: Stephen Rose  (was: Mac Yang)

Sorry, I made a mistake assigning this a moment ago when marking it as open 
(while Todd's feedback is addressed).

 Allow appending to existing SequenceFiles
 -

 Key: HADOOP-7139
 URL: https://issues.apache.org/jira/browse/HADOOP-7139
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 0.20.1
Reporter: Stephen Rose
Assignee: Stephen Rose
Priority: Minor
 Fix For: 0.23.0

 Attachments: HADOOP-7139.patch, HADOOP-7139.patch, HADOOP-7139.patch, 
 HADOOP-7139.patch

   Original Estimate: 2h
  Remaining Estimate: 2h



-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-7098) tasktracker property not set in conf/hadoop-env.sh

2011-03-03 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7098:
--

Attachment: HADOOP-7098.patch

Good to know.

Regenerating patch from top level so Hudson can check it.

 tasktracker property not set in conf/hadoop-env.sh
 --

 Key: HADOOP-7098
 URL: https://issues.apache.org/jira/browse/HADOOP-7098
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.20.2, 0.21.1, 0.22.0
Reporter: Bernd Fondermann
Assignee: Bernd Fondermann
 Attachments: HADOOP-7098.patch, hadoop-7098.patch


 For all cluster components except the TaskTracker, the OPTS environment variable 
 is set like this in hadoop-env.sh:
 export HADOOP_COMPONENT_OPTS="-Dcom.sun.management.jmxremote $HADOOP_COMPONENT_OPTS"
 The provided patch fixes this.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-7098) tasktracker property not set in conf/hadoop-env.sh

2011-03-03 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7098:
--

   Resolution: Fixed
Fix Version/s: 0.23.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Bernd!

 tasktracker property not set in conf/hadoop-env.sh
 --

 Key: HADOOP-7098
 URL: https://issues.apache.org/jira/browse/HADOOP-7098
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.20.2, 0.21.1, 0.22.0
Reporter: Bernd Fondermann
Assignee: Bernd Fondermann
 Fix For: 0.23.0

 Attachments: HADOOP-7098.patch, hadoop-7098.patch


 For all cluster components except the TaskTracker, the OPTS environment variable 
 is set like this in hadoop-env.sh:
 export HADOOP_COMPONENT_OPTS="-Dcom.sun.management.jmxremote $HADOOP_COMPONENT_OPTS"
 The provided patch fixes this.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-6541) An Interactive Hadoop FS shell

2011-03-02 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6541:
--

Status: Open  (was: Patch Available)

 An Interactive Hadoop FS shell
 --

 Key: HADOOP-6541
 URL: https://issues.apache.org/jira/browse/HADOOP-6541
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: HADOOP-6541.2.patch, HADOOP-6541.3.patch, 
 HADOOP-6541.4.patch, HADOOP-6541.patch


 A shell that allows the user to execute multiple filesystem operations in a 
 single JVM instance at a prompt.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-7114) FsShell should dump all exceptions at DEBUG level

2011-03-02 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7114:
--

   Resolution: Fixed
Fix Version/s: 0.23.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

+1 I've just committed this. Thanks Todd!

 FsShell should dump all exceptions at DEBUG level
 -

 Key: HADOOP-7114
 URL: https://issues.apache.org/jira/browse/HADOOP-7114
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.23.0

 Attachments: hadoop-7114.txt


 Most of the FsShell commands catch exceptions and then just print out an 
 error like "foo: " + e.getLocalizedMessage(). This is fine when the exception 
 is user-facing (eg permissions errors) but in the case of a user hitting a 
 bug you get a useless error message with no stack trace. For example, 
 something like "chmod: null" in the case of a NullPointerException bug.
 It would help debug these cases for users and developers if we also logged 
 the exception with full trace at DEBUG level.
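
 A rough sketch of the pattern being asked for (illustrative class, not the attached patch): keep the short user-facing message, but also dump the full stack trace at DEBUG.

{code:java}
import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class ShellCommandSketch {
  private static final Log LOG = LogFactory.getLog(ShellCommandSketch.class);

  void chmod(String path) {
    try {
      doChmod(path);
    } catch (IOException e) {
      // Short message for the user, as today...
      System.err.println("chmod: " + e.getLocalizedMessage());
      // ...plus the full stack trace at DEBUG for diagnosing bugs like "chmod: null".
      if (LOG.isDebugEnabled()) {
        LOG.debug("Full exception trace for chmod on " + path, e);
      }
    }
  }

  private void doChmod(String path) throws IOException {
    // Placeholder for the real FileSystem call.
    throw new IOException("not implemented in this sketch");
  }
}
{code}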

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-6974) Configurable header buffer size for Hadoop HTTP server

2011-03-02 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6974:
--

Status: Open  (was: Patch Available)

Paul, unfortunately this no longer applies. Can you regenerate the patch and 
I'll commit it.

 Configurable header buffer size for Hadoop HTTP server
 --

 Key: HADOOP-6974
 URL: https://issues.apache.org/jira/browse/HADOOP-6974
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Paul Butler
Assignee: Paul Butler
 Attachments: HADOOP-6974.3.patch, HADOOP-6974.4.patch, 
 hadoop-6974.2.patch, hadoop-6974.patch


 This patch adds a configurable parameter dfs.http.header.buffer.size to 
 Hadoop which allows the buffer size to be configured from the xml 
 configuration.
 This fixes an issue that came up in an environment where the Hadoop servers 
 share a domain with other web applications that use domain cookies. The large 
 cookies overwhelmed Jetty's buffer which caused it to return a 413 error.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HADOOP-7098) tasktracker property not set in conf/hadoop-env.sh

2011-03-02 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001500#comment-13001500
 ] 

Tom White commented on HADOOP-7098:
---

+1 The lack of a HADOOP_TASKTRACKER_OPTS definition looks like an oversight 
(these variables were introduced in HADOOP-2551). HADOOP_TASKTRACKER_OPTS is 
already honoured by bin/mapred, so no changes are needed there.

Bernd, have you tested this manually? There's currently no easy way to write an 
automated test for this change.

 tasktracker property not set in conf/hadoop-env.sh
 --

 Key: HADOOP-7098
 URL: https://issues.apache.org/jira/browse/HADOOP-7098
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.20.2, 0.21.1, 0.22.0
Reporter: Bernd Fondermann
Assignee: Bernd Fondermann
 Attachments: hadoop-7098.patch


 For all cluster components except the TaskTracker, the OPTS environment variable 
 is set like this in hadoop-env.sh:
 export HADOOP_COMPONENT_OPTS="-Dcom.sun.management.jmxremote $HADOOP_COMPONENT_OPTS"
 The provided patch fixes this.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-6754) DefaultCodec.createOutputStream() leaks memory

2011-03-02 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6754:
--

   Resolution: Fixed
Fix Version/s: 0.23.0
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Aaron!

 DefaultCodec.createOutputStream() leaks memory
 --

 Key: HADOOP-6754
 URL: https://issues.apache.org/jira/browse/HADOOP-6754
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Fix For: 0.23.0

 Attachments: CompressionBug.java, HADOOP-6754.2.patch, 
 HADOOP-6754.patch, HADOOP-6754.patch


 DefaultCodec.createOutputStream() creates a new Compressor instance in each 
 OutputStream. Even if the OutputStream is closed, this leaks memory.
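
 For context, the usual way to avoid this kind of leak is to pool the Compressor explicitly; a hedged sketch (not the attached patch): obtain a Compressor from CodecPool, hand it to createOutputStream, and return it when done.

{code:java}
import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.Compressor;
import org.apache.hadoop.io.compress.DefaultCodec;

public class PooledCompressionExample {
  static void writeCompressed(OutputStream rawOut, byte[] data) throws IOException {
    DefaultCodec codec = new DefaultCodec();
    codec.setConf(new Configuration());
    // Borrow a Compressor from the pool instead of letting the stream allocate one.
    Compressor compressor = CodecPool.getCompressor(codec);
    try {
      CompressionOutputStream out = codec.createOutputStream(rawOut, compressor);
      out.write(data);
      out.finish();
    } finally {
      CodecPool.returnCompressor(compressor);   // reuse rather than leak
    }
  }
}
{code}

 SequenceFile obtains its compressors from CodecPool in much the same way.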

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-7035) Document incompatible API changes between releases

2011-03-02 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7035:
--

Fix Version/s: 0.22.0
 Assignee: Tom White

 Document incompatible API changes between releases
 --

 Key: HADOOP-7035
 URL: https://issues.apache.org/jira/browse/HADOOP-7035
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.22.0

 Attachments: jdiff-with-previous-release.sh, 
 jdiff-with-previous-release.sh


 We can use JDiff to generate a list of incompatible changes for each release. 
 See 
 https://issues.apache.org/jira/browse/HADOOP-6668?focusedCommentId=12860017&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12860017

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-6342) Create a script to squash a common, hdfs, and mapreduce tarball into a single hadoop tarball

2011-03-02 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6342:
--

Status: Open  (was: Patch Available)

We can close this now as it is subsumed by HADOOP-6846 (see 
https://issues.apache.org/jira/browse/HADOOP-6846?focusedCommentId=13001571&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13001571).

 Create a script to squash a common, hdfs, and mapreduce tarball into a single 
 hadoop tarball
 

 Key: HADOOP-6342
 URL: https://issues.apache.org/jira/browse/HADOOP-6342
 Project: Hadoop Common
  Issue Type: New Feature
  Components: build
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.21.1

 Attachments: HADOOP-6342.2.patch, HADOOP-6342.patch, h-6342.patch, 
 tar-munge, tar-munge


 It would be convenient for the transition if we had a script to take a set of 
 common, hdfs, and mapreduce tarballs and merge them into a single tarball. 
 This is intended just to help users who don't want to transition to split 
 projects for deployment immediately.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-6846) Scripts for building Hadoop 0.21.0 release

2011-03-02 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6846:
--

Fix Version/s: 0.22.0

 Scripts for building Hadoop 0.21.0 release
 --

 Key: HADOOP-6846
 URL: https://issues.apache.org/jira/browse/HADOOP-6846
 Project: Hadoop Common
  Issue Type: Task
  Components: build
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.22.0

 Attachments: release-scripts.tar.gz




-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-7030) new topology mapping implementations

2011-03-02 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7030:
--

Assignee: Patrick Angeles
  Status: Open  (was: Patch Available)

This looks like a useful addition. Here are my comments on the patch:

* Could you combine the two types of file, so that if there are three columns 
the first two are interpreted as a range, otherwise use the first as a single 
host. Or just support CIDR notation?
* Have you thought about using InetAddress to avoid implementing IP address 
parsing logic? 
http://guava-libraries.googlecode.com/svn/tags/release08/javadoc/com/google/common/net/InetAddresses.html
 might be useful (there was talk of introducing Guava recently).
* RefreshableDNSToSwitchMapping isn't hooked up yet, so perhaps it should go in 
a follow on JIRA.
* The name TableMapping is a bit general. How about FileBasedMapping, or 
similar?
* The configuration keys should go in CommonConfigurationKeysPublic.
* Primes are not needed in hashCode implementations. For Ip4, 
Arrays.hashCode(value) is sufficient (see the sketch after this list).
* The tests swallow exceptions - there should at least be a comment saying that 
this is expected. Also, fail() with a message is preferable to 
assertTrue(false).
* The tests should be JUnit 4 style.
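
For the hashCode and test-style points, a small illustration (the Ip4Sketch class and its parse() method are made up for this example, not taken from the patch):

{code:java}
import static org.junit.Assert.fail;

import java.util.Arrays;
import org.junit.Test;

// Hypothetical value class standing in for the patch's Ip4 type.
public class Ip4Sketch {
  private final byte[] value;

  public Ip4Sketch(byte[] value) {
    this.value = value.clone();
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof Ip4Sketch && Arrays.equals(value, ((Ip4Sketch) o).value);
  }

  @Override
  public int hashCode() {
    // No hand-rolled prime multipliers needed: Arrays.hashCode already mixes the bytes.
    return Arrays.hashCode(value);
  }

  static Ip4Sketch parse(String s) {
    // Stand-in parser that rejects anything that isn't a dotted quad.
    String[] parts = s.split("\\.");
    if (parts.length != 4) {
      throw new IllegalArgumentException("Not an IPv4 address: " + s);
    }
    byte[] bytes = new byte[4];
    for (int i = 0; i < 4; i++) {
      bytes[i] = (byte) Integer.parseInt(parts[i]);
    }
    return new Ip4Sketch(bytes);
  }
}

// JUnit 4 style test; fail() with a message instead of assertTrue(false).
class TestIp4Sketch {
  @Test
  public void testMalformedAddressRejected() {
    try {
      Ip4Sketch.parse("not-an-ip");
      fail("Expected parse() to reject a malformed address");
    } catch (IllegalArgumentException expected) {
      // Expected: the comment documents that the swallowed exception is deliberate.
    }
  }
}
{code}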

 new topology mapping implementations
 

 Key: HADOOP-7030
 URL: https://issues.apache.org/jira/browse/HADOOP-7030
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 0.21.0, 0.20.2, 0.20.1
Reporter: Patrick Angeles
Assignee: Patrick Angeles
 Attachments: HADOOP-7030-2.patch, HADOOP-7030.patch, topology.patch


 The default ScriptBasedMapping implementation of DNSToSwitchMapping for 
 determining cluster topology has some drawbacks. Principally, it forks to an 
 OS-specific script.
 This issue proposes two new Java implementations of DNSToSwitchMapping. 
 TableMapping reads a two column text file that maps an IP or hostname to a 
 rack ID. Ip4RangeMapping reads a three column text file where each line 
 represents a start and end IP range plus a rack ID.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-7112) Issue a warning when GenericOptionsParser libjars are not on local filesystem

2011-02-28 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7112:
--

   Resolution: Fixed
Fix Version/s: 0.23.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this.

 Issue a warning when GenericOptionsParser libjars are not on local filesystem
 -

 Key: HADOOP-7112
 URL: https://issues.apache.org/jira/browse/HADOOP-7112
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf, filecache
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.23.0

 Attachments: HADOOP-7112.patch


 In GenericOptionsParser#getLibJars() any jars that are not local filesystem 
 paths are silently ignored. We should issue a warning for users.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HADOOP-7145) Configuration.getLocalPath should trim whitespace from the provided directories

2011-02-15 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994970#comment-12994970
 ] 

Tom White commented on HADOOP-7145:
---

+1

 Configuration.getLocalPath should trim whitespace from the provided 
 directories
 ---

 Key: HADOOP-7145
 URL: https://issues.apache.org/jira/browse/HADOOP-7145
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.22.0

 Attachments: hadoop-7145.txt


 MR and HDFS use the Configuration.getTrimmedStrings API for local directory 
 lists, but in a few places also use Configuration.getLocalPath. The former 
 API trims whitespace around each entry in the list, but the latter doesn't. 
 This can cause some subtle problems - the latter API should be fixed to also 
 trim the directory names.
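
 For illustration (mapred.local.dir is just an example key; this snippet is not part of the patch), the difference between the trimmed and untrimmed accessors looks like this:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class TrimmedDirsExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    conf.set("mapred.local.dir", "/data/1, /data/2,\n/data/3");

    // getStrings() splits on commas but keeps surrounding whitespace,
    // so the second and third entries come back as " /data/2" and "\n/data/3".
    for (String dir : conf.getStrings("mapred.local.dir")) {
      System.out.println("untrimmed: [" + dir + "]");
    }

    // getTrimmedStrings() strips the whitespace around each entry.
    for (String dir : conf.getTrimmedStrings("mapred.local.dir")) {
      System.out.println("trimmed:   [" + dir + "]");
    }
  }
}
{code}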

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HADOOP-7140) IPC Reader threads do not stop when server stops

2011-02-14 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994489#comment-12994489
 ] 

Tom White commented on HADOOP-7140:
---

This looks fine, but I wonder if it would be simpler to act on the 
InterruptedException in Reader (rather than just log it, as the code currently 
does) so that the thread exits when the pool is shut down?

Minor nit: add messages to the test assertions so that the case of zero running 
threads can be more easily distinguished between the before and after cases.
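
A sketch of that alternative (illustrative names, not the actual ipc.Server Reader code): the reader loop treats InterruptedException as a shutdown signal and exits.

{code:java}
public class ReaderLoopSketch implements Runnable {
  private volatile boolean running = true;

  @Override
  public void run() {
    while (running) {
      try {
        doRead();                        // blocking read / select on connections
      } catch (InterruptedException e) {
        // Pool shutdown interrupts the worker: exit the loop so the thread stops,
        // instead of just logging and looping again.
        Thread.currentThread().interrupt();
        running = false;
      }
    }
  }

  public void shutdown() {
    running = false;
  }

  private void doRead() throws InterruptedException {
    Thread.sleep(100);                   // stand-in for the real blocking work
  }
}
{code}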

 IPC Reader threads do not stop when server stops
 

 Key: HADOOP-7140
 URL: https://issues.apache.org/jira/browse/HADOOP-7140
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.22.0

 Attachments: hadoop-7140.txt


 After HADOOP-6713, the new IPC Reader threads are not properly stopped when 
 the server shuts down. One repercussion of this is that conditions that are 
 supposed to shut down a daemon no longer work (eg the TT doesn't shut itself 
 down if it detects an incompatible build version)

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HADOOP-7140) IPC Reader threads do not stop when server stops

2011-02-14 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994630#comment-12994630
 ] 

Tom White commented on HADOOP-7140:
---

+1

 IPC Reader threads do not stop when server stops
 

 Key: HADOOP-7140
 URL: https://issues.apache.org/jira/browse/HADOOP-7140
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.22.0

 Attachments: hadoop-7140.txt, hadoop-7140.txt


 After HADOOP-6713, the new IPC Reader threads are not properly stopped when 
 the server shuts down. One repercussion of this is that conditions that are 
 supposed to shut down a daemon no longer work (eg the TT doesn't shut itself 
 down if it detects an incompatible build version)

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-7048) Wrong description of Block-Compressed SequenceFile Format in SequenceFile's javadoc

2011-02-10 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7048:
--

   Resolution: Fixed
Fix Version/s: 0.23.0
 Assignee: Jingguo Yao
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Jingguo Yao!

 Wrong description of Block-Compressed SequenceFile Format in SequenceFile's 
 javadoc
 ---

 Key: HADOOP-7048
 URL: https://issues.apache.org/jira/browse/HADOOP-7048
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 0.21.0
Reporter: Jingguo Yao
Assignee: Jingguo Yao
Priority: Minor
 Fix For: 0.23.0

 Attachments: HADOOP-7048.patch

   Original Estimate: 10m
  Remaining Estimate: 10m

 Here is the following description for Block-Compressed SequenceFile Format in 
 SequenceFile's javadoc:
  * <li>
  * Record <i>Block</i>
  *   <ul>
  * <li>Compressed key-lengths block-size</li>
  * <li>Compressed key-lengths block</li>
  * <li>Compressed keys block-size</li>
  * <li>Compressed keys block</li>
  * <li>Compressed value-lengths block-size</li>
  * <li>Compressed value-lengths block</li>
  * <li>Compressed values block-size</li>
  * <li>Compressed values block</li>
  *   </ul>
  * </li>
  * <li>
  * A sync-marker every few <code>100</code> bytes or so.
  * </li>
 This description misses "Uncompressed record number in the block". And "A 
 sync-marker every few <code>100</code> bytes or so" is not the case for 
 Block-Compressed SequenceFile Format. Correct description should be:
  * <li>
  * Record <i>Block</i>
  *   <ul>
  * <li>Uncompressed record number in the block</li>
  * <li>Compressed key-lengths block-size</li>
  * <li>Compressed key-lengths block</li>
  * <li>Compressed keys block-size</li>
  * <li>Compressed keys block</li>
  * <li>Compressed value-lengths block-size</li>
  * <li>Compressed value-lengths block</li>
  * <li>Compressed values block-size</li>
  * <li>Compressed values block</li>
  *   </ul>
  * </li>
  * <li>
  * A sync-marker every block.
  * </li>

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HADOOP-7129) Typo in method name getProtocolSigature

2011-02-01 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989392#comment-12989392
 ] 

Tom White commented on HADOOP-7129:
---

+1

 Typo in method name getProtocolSigature
 ---

 Key: HADOOP-7129
 URL: https://issues.apache.org/jira/browse/HADOOP-7129
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Todd Lipcon
 Attachments: HADOOP-7129.txt


 HADOOP-6904 introduced a method ProtocolSignature#getProtocolSigature, which 
 obviously has a typo. Let's not maintain an API with a typo in it.
 Annoyingly this will require commits to all three subprojects to fix :(

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HADOOP-7035) Document incompatible API changes between releases

2011-01-28 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7035:
--

Attachment: jdiff-with-previous-release.sh

Alan, I've updated the docs at http://people.apache.org/~tomwhite/HADOOP-7035/ 
to include all, stable-incompatible, evolving-incompatible, and 
unstable-incompatible changes. I agree that we should publish something like 
this with releases.

Doug, this is harder to do since we have to backport the audience/stability 
annotations, and work around the project split. I did something like this for 
0.21 at http://people.apache.org/~tomwhite/HADOOP-6668/docs/jdiff/changes.html

 Document incompatible API changes between releases
 --

 Key: HADOOP-7035
 URL: https://issues.apache.org/jira/browse/HADOOP-7035
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Tom White
 Attachments: jdiff-with-previous-release.sh, 
 jdiff-with-previous-release.sh


 We can use JDiff to generate a list of incompatible changes for each release. 
 See 
 https://issues.apache.org/jira/browse/HADOOP-6668?focusedCommentId=12860017&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12860017

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-7035) Document incompatible API changes between releases

2011-01-28 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988310#action_12988310
 ] 

Tom White commented on HADOOP-7035:
---

 I noticed it classifies a few changes as incompatible which I'd have 
 thought are compatible

It sometimes produces false positives (changes that are marked as incompatible, 
but in fact aren't), which is what these particular changes are. We might be 
able to improve the code to handle these cases.

 Would it be possible to also distinguish between source-compatible and 
 binary-compatible somehow?

This tool is for detecting source incompatible changes. For testing 
binary-compatibility we need another tool, like 
[Clirr|http://clirr.sourceforge.net/] (although I'm not sure how up to date 
this one is).

 Document incompatible API changes between releases
 --

 Key: HADOOP-7035
 URL: https://issues.apache.org/jira/browse/HADOOP-7035
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Tom White
 Attachments: jdiff-with-previous-release.sh, 
 jdiff-with-previous-release.sh


 We can use JDiff to generate a list of incompatible changes for each release. 
 See 
 https://issues.apache.org/jira/browse/HADOOP-6668?focusedCommentId=12860017&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12860017

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-7035) Document incompatible API changes between releases

2011-01-26 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7035:
--

Attachment: jdiff-with-previous-release.sh

Here's a script to generate the incompatible API changes between 0.21.0 and the 
0.22 branch. It ignores elements that are marked as @LimitedPrivate or 
@Private, and those that are marked as @Evolving or @Unstable. Furthermore, it 
uses a modified version of JDiff that only highlights incompatible changes (see 
patch at 
http://sourceforge.net/tracker/?func=detail&aid=2990626&group_id=37160&atid=419055).

The resulting output can be seen at 
http://people.apache.org/~tomwhite/HADOOP-7035/common/docs/jdiff/changes.html 
and 
http://people.apache.org/~tomwhite/HADOOP-7035/mapreduce/docs/jdiff/changes.html.
 (There's no HDFS output because its API is private.)

Should we incorporate this into the release process (so that all changes are 
accounted for)? Or have Hudson run it to detect incompatible changes introduced 
on a per-patch basis?


 Document incompatible API changes between releases
 --

 Key: HADOOP-7035
 URL: https://issues.apache.org/jira/browse/HADOOP-7035
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Tom White
 Attachments: jdiff-with-previous-release.sh


 We can use JDiff to generate a list of incompatible changes for each release. 
 See 
 https://issues.apache.org/jira/browse/HADOOP-6668?focusedCommentId=12860017&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12860017

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-7112) Issue a warning when GenericOptionsParser libjars are not on local filesystem

2011-01-20 Thread Tom White (JIRA)
Issue a warning when GenericOptionsParser libjars are not on local filesystem
-

 Key: HADOOP-7112
 URL: https://issues.apache.org/jira/browse/HADOOP-7112
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf, filecache
Reporter: Tom White
Assignee: Tom White
 Attachments: HADOOP-7112.patch

In GenericOptionsParser#getLibJars() any jars that are not local filesystem 
paths are silently ignored. We should issue a warning for users.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-7112) Issue a warning when GenericOptionsParser libjars are not on local filesystem

2011-01-20 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7112:
--

Status: Patch Available  (was: Open)

 Issue a warning when GenericOptionsParser libjars are not on local filesystem
 -

 Key: HADOOP-7112
 URL: https://issues.apache.org/jira/browse/HADOOP-7112
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf, filecache
Reporter: Tom White
Assignee: Tom White
 Attachments: HADOOP-7112.patch


 In GenericOptionsParser#getLibJars() any jars that are not local filesystem 
 paths are silently ignored. We should issue a warning for users.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-7112) Issue a warning when GenericOptionsParser libjars are not on local filesystem

2011-01-20 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-7112:
--

Attachment: HADOOP-7112.patch

Straightforward patch to add a warning message.
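
Roughly, the check being added looks like this (a sketch with illustrative names, not the attached patch): keep local jars and warn about, rather than silently drop, anything else.

{code:java}
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class LibJarsWarningSketch {
  private static final Log LOG = LogFactory.getLog(LibJarsWarningSketch.class);

  static List<String> validateLibJars(String libjars) throws URISyntaxException {
    List<String> local = new ArrayList<String>();
    for (String jar : libjars.split(",")) {
      String scheme = new URI(jar).getScheme();
      if (scheme == null || "file".equals(scheme)) {
        local.add(jar);                  // local path: keep it
      } else {
        // The point of HADOOP-7112: tell the user instead of ignoring silently.
        LOG.warn("The libjars file " + jar
            + " is not on the local filesystem. Ignoring.");
      }
    }
    return local;
  }
}
{code}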

 Issue a warning when GenericOptionsParser libjars are not on local filesystem
 -

 Key: HADOOP-7112
 URL: https://issues.apache.org/jira/browse/HADOOP-7112
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf, filecache
Reporter: Tom White
Assignee: Tom White
 Attachments: HADOOP-7112.patch


 In GenericOptionsParser#getLibJars() any jars that are not local filesystem 
 paths are silently ignored. We should issue a warning for users.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-7093) Servlets should default to text/plain

2011-01-11 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980349#action_12980349
 ] 

Tom White commented on HADOOP-7093:
---

A few comments:

* StackServlet shouldn't use HtmlQuoting since it is serving plain text.
* We need to be sure that StackServlet is serving UTF-8-encoded text. Currently 
it is using the default platform encoding since it is using a writer 
constructed with new PrintWriter(response.getOutputStream()), see 
http://download.oracle.com/javase/6/docs/api/java/io/PrintWriter.html#PrintWriter%28java.io.OutputStream%29.
 Rather, we might use response.getWriter(), which uses the character encoding 
returned by ServletResponse#getCharacterEncoding(), which should pick it up 
from our earlier call to ServletResponse#setContentType, according to 
http://download.oracle.com/javaee/6/api/javax/servlet/ServletResponse.html#getWriter%28%29.
 The other servlets need checking for this too (see the sketch after this list).
* For JSON, MetricsServlet should set the content type to "application/json; 
charset=utf-8". It's not currently setting the content type.
* ConfServlet should set the charset explicitly too.
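
For the encoding points above, a minimal sketch (a hypothetical servlet, not one of the actual Hadoop servlets): set the content type with an explicit charset before obtaining the writer, so getWriter() uses UTF-8 rather than the platform default.

{code:java}
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class PlainTextExampleServlet extends HttpServlet {
  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    resp.setContentType("text/plain; charset=utf-8");
    // getWriter() picks up the charset set above; wrapping getOutputStream()
    // in new PrintWriter(...) would silently fall back to the platform encoding.
    PrintWriter out = resp.getWriter();
    out.println("thread dump or metrics output goes here");
  }
}
{code}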

 Servlets should default to text/plain
 -

 Key: HADOOP-7093
 URL: https://issues.apache.org/jira/browse/HADOOP-7093
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.22.0

 Attachments: hadoop-7093.txt


 In trunk the servlets like /stacks and /metrics are returning text/html 
 content-type instead of text/plain. Security wise it's much safer to default 
 to text/plain and require servlets to explicitly set the content-type to 
 text/html when required.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-7093) Servlets should default to text/plain

2011-01-11 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980550#action_12980550
 ] 

Tom White commented on HADOOP-7093:
---

+1

Nit: with this patch there are two mimetypes for JSON; we should use 
application/json in both cases (http://tools.ietf.org/html/rfc4627).

 Servlets should default to text/plain
 -

 Key: HADOOP-7093
 URL: https://issues.apache.org/jira/browse/HADOOP-7093
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.22.0

 Attachments: hadoop-7093.2.txt, hadoop-7093.txt


 In trunk the servlets like /stacks and /metrics are returning text/html 
 content-type instead of text/plain. Security wise it's much safer to default 
 to text/plain and require servlets to explicitly set the content-type to 
 text/html when required.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (HADOOP-6811) Remove EC2 bash scripts

2011-01-10 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White reassigned HADOOP-6811:
-

Assignee: Tom White

 Remove EC2 bash scripts
 ---

 Key: HADOOP-6811
 URL: https://issues.apache.org/jira/browse/HADOOP-6811
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
 Fix For: 0.22.0


 The bash scripts are deprecated in 0.21 (HADOOP-6403) in favour of scripts in 
 Whirr (http://incubator.apache.org/projects/whirr.html). They should be 
 removed in 0.22. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6811) Remove EC2 bash scripts

2011-01-10 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6811:
--

Attachment: HADOOP-6811.patch

Simple patch to remove 'src/contrib/ec2'.

 Remove EC2 bash scripts
 ---

 Key: HADOOP-6811
 URL: https://issues.apache.org/jira/browse/HADOOP-6811
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
 Fix For: 0.22.0

 Attachments: HADOOP-6811.patch


 The bash scripts are deprecated in 0.21 (HADOOP-6403) in favour of scripts in 
 Whirr (http://incubator.apache.org/projects/whirr.html). They should be 
 removed in 0.22. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6811) Remove EC2 bash scripts

2011-01-10 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6811:
--

Status: Patch Available  (was: Open)

 Remove EC2 bash scripts
 ---

 Key: HADOOP-6811
 URL: https://issues.apache.org/jira/browse/HADOOP-6811
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
 Fix For: 0.22.0

 Attachments: HADOOP-6811.patch


 The bash scripts are deprecated in 0.21 (HADOOP-6403) in favour of scripts in 
 Whirr (http://incubator.apache.org/projects/whirr.html). They should be 
 removed in 0.22. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HADOOP-938) too many SequenceFile.createWriter() methods

2011-01-04 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White resolved HADOOP-938.
--

Resolution: Duplicate

Fixed in HADOOP-6856.

 too many SequenceFile.createWriter() methods
 

 Key: HADOOP-938
 URL: https://issues.apache.org/jira/browse/HADOOP-938
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Reporter: Doug Cutting

 There are too many SequenceFile.createWriter() method signatures.  This 
 method has two required parameters: a Configuration and a Path.  It has one 
 obsolete parameter: a FileSystem.  And it has five optional parameters: 
 CompressionType, CompressionCodec, Progress, replication, and metadata.
 We should remove the obsolete parameter and make all optional parameters into 
 setters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-7077) Use Java's ServiceLoader to add default resources to Configuration

2010-12-23 Thread Tom White (JIRA)
Use Java's ServiceLoader to add default resources to Configuration
--

 Key: HADOOP-7077
 URL: https://issues.apache.org/jira/browse/HADOOP-7077
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Reporter: Tom White


Currently each class with a main() method (in HDFS and MapReduce) calls 
Configuration.addDefaultResource() to add the names of resource files to load 
(e.g. see DataNode and NameNode which both add hdfs-default.xml and 
hdfs-site.xml). We could reduce the code duplication by allowing the use of 
[java.util.ServiceLoader|http://download.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html]
 to do the initialization.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-7077) Use Java's ServiceLoader to add default resources to Configuration

2010-12-23 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974674#action_12974674
 ] 

Tom White commented on HADOOP-7077:
---

To do this we would add an interface called ConfigurationResourceLoader or 
somesuch with a single loadResources() method. The static initializer in 
Configuration would use ServiceLoader to load the implementations of 
ConfigurationResourceLoader and call loadResources() on each of them. Then in 
HDFS we would have an implementation of ConfigurationResourceLoader that adds 
hdfs-default.xml and hdfs-site.xml. (And similarly for MapReduce.)
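
A sketch of that wiring (ConfigurationResourceLoader is the proposed interface and does not exist yet; the other class names are illustrative):

{code:java}
import java.util.ServiceLoader;
import org.apache.hadoop.conf.Configuration;

// Proposed interface; in practice each implementation would be a public
// top-level class listed in a META-INF/services file so ServiceLoader finds it.
interface ConfigurationResourceLoader {
  void loadResources();
}

// HDFS-side implementation, replacing the explicit addDefaultResource() calls
// currently made by NameNode/DataNode (MapReduce would ship its own).
class HdfsConfigurationResourceLoader implements ConfigurationResourceLoader {
  @Override
  public void loadResources() {
    Configuration.addDefaultResource("hdfs-default.xml");
    Configuration.addDefaultResource("hdfs-site.xml");
  }
}

// Sketch of the static initializer that would live in Configuration itself:
// discovery via ServiceLoader replaces the hard-coded calls in each main().
class ConfigurationBootstrapSketch {
  static {
    for (ConfigurationResourceLoader loader
        : ServiceLoader.load(ConfigurationResourceLoader.class)) {
      loader.loadResources();
    }
  }
}
{code}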

 Use Java's ServiceLoader to add default resources to Configuration
 --

 Key: HADOOP-7077
 URL: https://issues.apache.org/jira/browse/HADOOP-7077
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Reporter: Tom White

 Currently each class with a main() method (in HDFS and MapReduce) calls 
 Configuration.addDefaultResource() to add the names of resource files to load 
 (e.g. see DataNode and NameNode which both add hdfs-default.xml and 
 hdfs-site.xml). We could reduce the code duplication by allowing the use of 
 [java.util.ServiceLoader|http://download.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html]
  to do the initialization.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6298) BytesWritable#getBytes is a bad name that leads to programming mistakes

2010-12-16 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12972097#action_12972097
 ] 

Tom White commented on HADOOP-6298:
---

+1 This looks good to me.

Minor nit/improvement: the javadoc references to {{copyBytes()}} could be 
links, also the javadoc for the new method could refer to {{getBytes()}}, using 
@see or a link.
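
For reference, the difference between the padded accessor and the copy method side by side (a small illustration; copyBytes() here assumes the new method added by the attached patch):

{code:java}
import java.util.Arrays;
import org.apache.hadoop.io.BytesWritable;

public class BytesWritableLengthExample {
  public static void main(String[] args) {
    BytesWritable bw = new BytesWritable(new byte[] {1, 2, 3});
    bw.setSize(2); // logical length is now 2, but the backing array stays length 3

    byte[] padded = bw.getBytes();                              // padded backing array - the common trap
    byte[] manualCopy = Arrays.copyOf(padded, bw.getLength());  // correct manual workaround
    byte[] copy = bw.copyBytes();                               // the new convenience method

    System.out.println(padded.length + " / " + manualCopy.length + " / " + copy.length);
    // prints: 3 / 2 / 2
  }
}
{code}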

 BytesWritable#getBytes is a bad name that leads to programming mistakes
 ---

 Key: HADOOP-6298
 URL: https://issues.apache.org/jira/browse/HADOOP-6298
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.20.1
Reporter: Nathan Marz
Assignee: Owen O'Malley
 Fix For: 0.22.0

 Attachments: h-6298.patch


 Pretty much everyone at Rapleaf who has worked with Hadoop has misused 
 BytesWritable#getBytes at some point, not expecting the byte array to be 
 padded. I think we can completely alleviate these programming mistakes by 
 deprecating and renaming this method (again) to be more descriptive. I 
 propose getPaddedBytes() or getPaddedValue(). It would also be helpful to 
 have a helper method getNonPaddedValue() that makes a copy into a 
 non-padded byte array. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6298) BytesWritable#getBytes is a bad name that leads to programming mistakes

2010-12-13 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971040#action_12971040
 ] 

Tom White commented on HADOOP-6298:
---

How about adding {{getBytesNonPadded()}} which creates a copy in a non-padded 
byte array? By naming it like this, this name would appear next to 
{{getBytes()}} in IDE autocompletion lists, which hopefully would alert users 
to the difference between the two methods.

 BytesWritable#getBytes is a bad name that leads to programming mistakes
 ---

 Key: HADOOP-6298
 URL: https://issues.apache.org/jira/browse/HADOOP-6298
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.20.1
Reporter: Nathan Marz

 Pretty much everyone at Rapleaf who has worked with Hadoop has misused 
 BytesWritable#getBytes at some point, not expecting the byte array to be 
 padded. I think we can completely alleviate these programming mistakes by 
 deprecating and renaming this method (again) to be more descriptive. I 
 propose getPaddedBytes() or getPaddedValue(). It would also be helpful to 
 have a helper method getNonPaddedValue() that makes a copy into a 
 non-padded byte array. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6939) Inconsistent lock ordering in AbstractDelegationTokenSecretManager

2010-12-03 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6939:
--

   Resolution: Fixed
Fix Version/s: 0.23.0
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Todd!

(I re-ran the tests that failed for hudson and they all passed for me.)

 Inconsistent lock ordering in AbstractDelegationTokenSecretManager
 --

 Key: HADOOP-6939
 URL: https://issues.apache.org/jira/browse/HADOOP-6939
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.23.0

 Attachments: HADOOP-6939.patch, hadoop-6939.txt, lockorder.png


 AbstractDelegationTokenSecretManager.startThreads() is synchronized, which 
 calls updateCurrentKey(), which calls logUpdateMasterKey. 
 logUpdateMasterKey's implementation for HDFS's manager calls 
 namesystem.logUpdateMasterKey() which is synchronized. Thus the lock order is 
 ADTSM -> FSN. In FSN.saveNamespace, though, it calls 
 DTSM.saveSecretManagerState(), so the lock order is FSN -> ADTSM.
 I don't think this deadlock occurs in practice since saveNamespace won't 
 occur until after the ADTSM has started its threads, but should be fixed 
 anyway.
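
 An abstract illustration of the two opposite lock orders described above (stand-in classes, not the real ADTSM/FSNamesystem code):

{code:java}
public class LockOrderSketch {
  private final Object secretManagerLock = new Object(); // plays the ADTSM monitor
  private final Object namesystemLock = new Object();    // plays the FSN monitor

  // ADTSM.startThreads() -> FSN.logUpdateMasterKey(): acquires ADTSM then FSN.
  void startThreads() {
    synchronized (secretManagerLock) {
      synchronized (namesystemLock) {
        // update the current key and log it to the edit log
      }
    }
  }

  // FSN.saveNamespace() -> ADTSM.saveSecretManagerState(): acquires FSN then ADTSM.
  void saveNamespace() {
    synchronized (namesystemLock) {
      synchronized (secretManagerLock) {
        // persist the secret manager state
      }
    }
  }
}
{code}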

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6685) Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration

2010-11-29 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964847#action_12964847
 ] 

Tom White commented on HADOOP-6685:
---

Owen: All of the technical feedback for this patch has been addressed, 
including technical feedback that Tom gave me offline.

Was this the feedback I gave on MAPREDUCE-1462 back in February? I haven't 
given any feedback offline for this issue.

Owen: Making compromises to not change SequenceFile or moving the plugins to 
the contrib module that would destroy the usability of the patch, isn't really 
compromise. They are just thinly veiled attempts to make it difficult to use 
ProtoBufs or Thrift in user's applications.

The original work for Thrift and Protocol Buffers serializations (MAPREDUCE-376 
and MAPREDUCE-377) was as contrib modules, so if we want to change that 
approach, then we need to get consensus on doing so. That consensus hasn't been 
forthcoming so they should be left as optional contrib modules. Serializations 
in this form are easy to use by the way: users just add the relevant contrib 
jar and the serialization jar to the job, just like any other dependency.



 Change the generic serialization framework API to use serialization-specific 
 bytes instead of Map<String,String> for configuration
 --

 Key: HADOOP-6685
 URL: https://issues.apache.org/jira/browse/HADOOP-6685
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.22.0

 Attachments: libthrift.jar, serial.patch, serial4.patch, 
 serial6.patch, serial7.patch, SerializationAtSummit.pdf


 Currently, the generic serialization framework uses Map<String,String> for 
 the serialization specific configuration. Since this data is really internal 
 to the specific serialization, I think we should change it to be an opaque 
 binary blob. This will simplify the interface for defining specific 
 serializations for different contexts (MAPREDUCE-1462). It will also move us 
 toward having serialized objects for Mappers, Reducers, etc (MAPREDUCE-1183).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6939) Inconsistent lock ordering in AbstractDelegationTokenSecretManager

2010-11-29 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6939:
--

Attachment: HADOOP-6939.patch

Fixed patch.

 Inconsistent lock ordering in AbstractDelegationTokenSecretManager
 --

 Key: HADOOP-6939
 URL: https://issues.apache.org/jira/browse/HADOOP-6939
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: HADOOP-6939.patch, hadoop-6939.txt, lockorder.png


 AbstractDelegationTokenSecretManager.startThreads() is synchronized, which 
 calls updateCurrentKey(), which calls logUpdateMasterKey. 
 logUpdateMasterKey's implementation for HDFS's manager calls 
 namesystem.logUpdateMasterKey() which is synchronized. Thus the lock order is 
 ADTSM -> FSN. In FSN.saveNamespace, though, it calls 
 DTSM.saveSecretManagerState(), so the lock order is FSN -> ADTSM.
 I don't think this deadlock occurs in practice since saveNamespace won't 
 occur until after the ADTSM has started its threads, but should be fixed 
 anyway.
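
(A schematic sketch of the inconsistent ordering described above; these are 
stand-in classes, not the real ADTSM/FSNamesystem code.)

{code}
// Two code paths take the same pair of monitors in opposite orders,
// which is the classic precondition for deadlock.
public class LockOrderSketch {
  private final Object adtsmLock = new Object(); // stands in for the ADTSM monitor
  private final Object fsnLock = new Object();   // stands in for the FSNamesystem monitor

  // startThreads() -> updateCurrentKey() -> logUpdateMasterKey(): ADTSM, then FSN.
  void startThreads() {
    synchronized (adtsmLock) {
      synchronized (fsnLock) { /* log the new master key */ }
    }
  }

  // saveNamespace() -> saveSecretManagerState(): FSN, then ADTSM.
  void saveNamespace() {
    synchronized (fsnLock) {
      synchronized (adtsmLock) { /* write secret manager state */ }
    }
  }
}
{code}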

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6939) Inconsistent lock ordering in AbstractDelegationTokenSecretManager

2010-11-29 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6939:
--

Status: Patch Available  (was: Open)

 Inconsistent lock ordering in AbstractDelegationTokenSecretManager
 --

 Key: HADOOP-6939
 URL: https://issues.apache.org/jira/browse/HADOOP-6939
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: HADOOP-6939.patch, hadoop-6939.txt, lockorder.png


 AbstractDelegationTokenSecretManager.startThreads() is synchronized, which 
 calls updateCurrentKey(), which calls logUpdateMasterKey. 
 logUpdateMasterKey's implementation for HDFS's manager calls 
 namesystem.logUpdateMasterKey() which is synchronized. Thus the lock order is 
 ADTSM -> FSN. In FSN.saveNamespace, though, it calls 
 DTSM.saveSecretManagerState(), so the lock order is FSN -> ADTSM.
 I don't think this deadlock occurs in practice since saveNamespace won't 
 occur until after the ADTSM has started its threads, but should be fixed 
 anyway.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6685) Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration

2010-11-22 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934566#action_12934566
 ] 

Tom White commented on HADOOP-6685:
---

I have two serious issues with the current patch, which I have mentioned above. 
However, given that they have not been adequately addressed I feel I have no 
option but to vote -1.

The first is that no change is needed in SequenceFile unless we want to support 
Avro, but, given that Avro data files were designed for this, and are 
multi-lingual, why change the SequenceFile format solely to support Avro? Are 
Avro data files insufficient? Note that Thrift and Protocol Buffers can be 
stored in today's SequenceFiles.

The second is that this patch adds new serializations which introduce into the 
core a new dependency on a particular version of each of Avro, Thrift, and PB, 
in a non-pluggable way.

This type of dependency is qualitatively different to other dependencies. 
Hadoop depends on log4j for instance, so if a user's code does too, then it 
needs to use the same version. A recent JIRA made it possible to specify a 
different version of log4j in the job, but this only works if the version the 
user specifies is compatible with *both* their code and the Hadoop kernel code.

However, in the case of a PB serialization, for example, the PB library is not 
used in Hadoop except in the serialization code for serializing the user's data 
type. So it's a user-level concern, and should be compiled as such - putting it 
in core Hadoop is asking for trouble in the future, since the Hadoop releases 
won't keep pace with the union of PB, Thrift, and Avro releases. These 
serialization plugins should be standalone, or at least easily re-compilable 
in a way that doesn't involve recompiling all of Hadoop, such as a contrib 
module. The user just treats the plugin JAR as another code dependency.

To move forward on this issue it's clear that compromise is needed. I actually 
prefer strings in serialization (HADOOP-6420), but am prepared to compromise 
over it, in the interests of finding consensus.


 Change the generic serialization framework API to use serialization-specific 
 bytes instead of Map<String,String> for configuration
 --

 Key: HADOOP-6685
 URL: https://issues.apache.org/jira/browse/HADOOP-6685
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.22.0

 Attachments: libthrift.jar, serial.patch, serial4.patch, 
 serial6.patch, serial7.patch, SerializationAtSummit.pdf


 Currently, the generic serialization framework uses Map<String,String> for 
 the serialization specific configuration. Since this data is really internal 
 to the specific serialization, I think we should change it to be an opaque 
 binary blob. This will simplify the interface for defining specific 
 serializations for different contexts (MAPREDUCE-1462). It will also move us 
 toward having serialized objects for Mappers, Reducers, etc (MAPREDUCE-1183).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6685) Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration

2010-11-16 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932396#action_12932396
 ] 

Tom White commented on HADOOP-6685:
---

Here's my feedback on the patch:

# I think the new serializations should be optional dependencies. Mandating a 
particular version of Thrift, Protocol Buffers, and Avro is going to cause 
problems for folks down the line, since we would be tying the version to 
Hadoop's release cycle, which is infrequent. Making the serializations 
libraries (or contrib modules, as in MAPREDUCE-376, MAPREDUCE-377) makes them 
independent, and will make it easier to support the version of the 
serialization library the user wants.
# I preferred the version where the Serialization could choose the way it 
serialized itself. In the current patch, if you wrote Avro data in a 
SequenceFile you would have Writables for the file container, and a PB-encoded 
Avro schema for the serialization. Having so many serialization mechanisms is 
potentially brittle.
# I'm not sure we need the full generality of PB for serializing serializations. 
If the serialization could choose its self-serialization mechanism, then 
TypedSerialization could just write its type as a Text object (see the sketch 
below). Doing this would remove the core dependency on PB, and allow 1.
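
(A sketch of what point 3 could look like; the class and field names are 
illustrative, not from the patch.)

{code}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Text;

// A serialization that writes its own metadata as a Text string, with no PB involved.
public class TypedSerializationMetadata {
  private String typeName;

  public void write(DataOutput out) throws IOException {
    Text.writeString(out, typeName);
  }

  public void readFields(DataInput in) throws IOException {
    typeName = Text.readString(in);
  }
}
{code}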


 Change the generic serialization framework API to use serialization-specific 
 bytes instead of Map<String,String> for configuration
 --

 Key: HADOOP-6685
 URL: https://issues.apache.org/jira/browse/HADOOP-6685
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.22.0

 Attachments: libthrift.jar, serial.patch, serial4.patch, 
 serial6.patch, SerializationAtSummit.pdf


 Currently, the generic serialization framework uses Map<String,String> for 
 the serialization specific configuration. Since this data is really internal 
 to the specific serialization, I think we should change it to be an opaque 
 binary blob. This will simplify the interface for defining specific 
 serializations for different contexts (MAPREDUCE-1462). It will also move us 
 toward having serialized objects for Mappers, Reducers, etc (MAPREDUCE-1183).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6685) Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration

2010-11-16 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932727#action_12932727
 ] 

Tom White commented on HADOOP-6685:
---

bq. Avro is already a dependency. Thrift is already a dependency for HDFS (see 
HDFS-1484).

There's a difference. Avro is used to implement an internal Hadoop format, and 
Thrift, which is used in the thriftfs contrib module, is also an internal 
detail. So users don't care about the version numbers for these libraries. 
(Also, the latter is a contrib module, so users can elect not to include it.) 
The case I'm thinking about here is that if a user has a Thrift file definition 
that uses a feature of a later version of Thrift than is included in Hadoop, 
then they can't use it. If we make it a library then it becomes possible to 
update the library independently.

 Change the generic serialization framework API to use serialization-specific 
 bytes instead of Map<String,String> for configuration
 --

 Key: HADOOP-6685
 URL: https://issues.apache.org/jira/browse/HADOOP-6685
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.22.0

 Attachments: libthrift.jar, serial.patch, serial4.patch, 
 serial6.patch, SerializationAtSummit.pdf


 Currently, the generic serialization framework uses Map<String,String> for 
 the serialization specific configuration. Since this data is really internal 
 to the specific serialization, I think we should change it to be an opaque 
 binary blob. This will simplify the interface for defining specific 
 serializations for different contexts (MAPREDUCE-1462). It will also move us 
 toward having serialized objects for Mappers, Reducers, etc (MAPREDUCE-1183).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6496) HttpServer sends wrong content-type for CSS files (and others)

2010-11-15 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6496:
--

   Resolution: Fixed
Fix Version/s: 0.22.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Todd!

 HttpServer sends wrong content-type for CSS files (and others)
 --

 Key: HADOOP-6496
 URL: https://issues.apache.org/jira/browse/HADOOP-6496
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0, 0.22.0
Reporter: Lars Francke
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.22.0

 Attachments: hadoop-6496.txt, hadoop-6496.txt


 CSS files are sent as text/html, causing problems if the HTML page is rendered 
 in standards mode. The HDFS interface for example still works because it is 
 rendered in quirks mode, the HBase interface doesn't work because it is 
 rendered in standards mode. See HBASE-2110 for more details.
 I've had a quick look at HttpServer but I'm too unfamiliar with it to see the 
 problem. I think this started happening with HADOOP-6441 which would lead me 
 to believe that the filter is called for every request and not only *.jsp and 
 *.html. I'd consider this a bug but I don't know enough about this to provide 
 a fix.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-7035) Generate incompatible API changes between releases

2010-11-15 Thread Tom White (JIRA)
Generate incompatible API changes between releases
--

 Key: HADOOP-7035
 URL: https://issues.apache.org/jira/browse/HADOOP-7035
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Tom White
 Fix For: 0.22.0


We can use JDiff to generate a list of incompatible changes for each release. 
See 
https://issues.apache.org/jira/browse/HADOOP-6668?focusedCommentId=12860017page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12860017

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-7035) Document incompatible API changes between releases

2010-11-15 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932197#action_12932197
 ] 

Tom White commented on HADOOP-7035:
---

Good point - thanks for re-wording it!

 Document incompatible API changes between releases
 --

 Key: HADOOP-7035
 URL: https://issues.apache.org/jira/browse/HADOOP-7035
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Tom White
 Fix For: 0.22.0


 We can use JDiff to generate a list of incompatible changes for each release. 
 See 
 https://issues.apache.org/jira/browse/HADOOP-6668?focusedCommentId=12860017page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12860017

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6685) Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration

2010-11-12 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931517#action_12931517
 ] 

Tom White commented on HADOOP-6685:
---

 If one's records don't implement the {{Writable}} interface, then there's no 
 reasonable binary container in Hadoop.

SequenceFile supports non-Writable types already. The limitation today is that 
there must be a one-to-one mapping between Java class and serialized data type. 
I think that can be satisfied by both Thrift and Protocol Buffers. For Avro, I 
don't think we want to support it in SequenceFile, as we should instead 
encourage use of Avro Data File, which is like SequenceFile but interoperable 
with other languages.

A process question: given how we failed to gain consensus last time, what could 
we do differently this time round? A design document to motivate the use cases? 
Any other suggestions?

 Change the generic serialization framework API to use serialization-specific 
 bytes instead of Map<String,String> for configuration
 --

 Key: HADOOP-6685
 URL: https://issues.apache.org/jira/browse/HADOOP-6685
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: serial.patch


 Currently, the generic serialization framework uses Map<String,String> for 
 the serialization specific configuration. Since this data is really internal 
 to the specific serialization, I think we should change it to be an opaque 
 binary blob. This will simplify the interface for defining specific 
 serializations for different contexts (MAPREDUCE-1462). It will also move us 
 toward having serialized objects for Mappers, Reducers, etc (MAPREDUCE-1183).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6974) Configurable header buffer size for Hadoop HTTP server

2010-11-10 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930791#action_12930791
 ] 

Tom White commented on HADOOP-6974:
---

With git you need to use --no-prefix so that Hudson can apply the patch.

 I haven't added tests for larger headers

If you've manually tested the patch with larger headers, then that's OK too.

 Configurable header buffer size for Hadoop HTTP server
 --

 Key: HADOOP-6974
 URL: https://issues.apache.org/jira/browse/HADOOP-6974
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Paul Butler
Assignee: Paul Butler
 Attachments: hadoop-6974.2.patch, HADOOP-6974.3.patch, 
 hadoop-6974.patch


 This patch adds a configurable parameter dfs.http.header.buffer.size to 
 Hadoop which allows the buffer size to be configured from the xml 
 configuration.
 This fixes an issue that came up in an environment where the Hadoop servers 
 share a domain with other web applications that use domain cookies. The large 
 cookies overwhelmed Jetty's buffer which caused it to return a 413 error.
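
(A rough sketch of where such a key could be applied, assuming the Jetty 6 
connector API; the 4096 fallback is an assumption, not a documented default, 
and this is not the attached patch.)

{code}
import org.apache.hadoop.conf.Configuration;
import org.mortbay.jetty.nio.SelectChannelConnector;

public class HeaderBufferSketch {
  // Read the proposed key and size the connector's header buffer before the
  // HTTP server is started.
  public static SelectChannelConnector newConnector(Configuration conf) {
    SelectChannelConnector connector = new SelectChannelConnector();
    connector.setHeaderBufferSize(conf.getInt("dfs.http.header.buffer.size", 4096));
    return connector;
  }
}
{code}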

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6496) HttpServer sends wrong content-type for CSS files (and others)

2010-11-10 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930803#action_12930803
 ] 

Tom White commented on HADOOP-6496:
---

+1 looks good.

 HttpServer sends wrong content-type for CSS files (and others)
 --

 Key: HADOOP-6496
 URL: https://issues.apache.org/jira/browse/HADOOP-6496
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0, 0.22.0
Reporter: Lars Francke
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-6496.txt, hadoop-6496.txt


 CSS files are sent as text/html, causing problems if the HTML page is rendered 
 in standards mode. The HDFS interface for example still works because it is 
 rendered in quirks mode, the HBase interface doesn't work because it is 
 rendered in standards mode. See HBASE-2110 for more details.
 I've had a quick look at HttpServer but I'm too unfamiliar with it to see the 
 problem. I think this started happening with HADOOP-6441 which would lead me 
 to believe that the filter is called for every request and not only *.jsp and 
 *.html. I'd consider this a bug but I don't know enough about this to provide 
 a fix.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6939) Inconsistent lock ordering in AbstractDelegationTokenSecretManager

2010-11-10 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6939:
--

Status: Open  (was: Patch Available)

This patch doesn't compile, can you regenerate it please?

 Inconsistent lock ordering in AbstractDelegationTokenSecretManager
 --

 Key: HADOOP-6939
 URL: https://issues.apache.org/jira/browse/HADOOP-6939
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-6939.txt, lockorder.png


 AbstractDelegationTokenSecretManager.startThreads() is synchronized, which 
 calls updateCurrentKey(), which calls logUpdateMasterKey. 
 logUpdateMasterKey's implementation for HDFS's manager calls 
 namesystem.logUpdateMasterKey() which is synchronized. Thus the lock order is 
 ADTSM -> FSN. In FSN.saveNamespace, though, it calls 
 DTSM.saveSecretManagerState(), so the lock order is FSN -> ADTSM.
 I don't think this deadlock occurs in practice since saveNamespace won't 
 occur until after the ADTSM has started its threads, but should be fixed 
 anyway.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-4675) Current Ganglia metrics implementation is incompatible with Ganglia 3.1

2010-11-10 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-4675:
--

   Resolution: Fixed
Fix Version/s: 0.22.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Brian and everyone who helped out!

 Current Ganglia metrics implementation is incompatible with Ganglia 3.1
 ---

 Key: HADOOP-4675
 URL: https://issues.apache.org/jira/browse/HADOOP-4675
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.21.0
Reporter: Brian Bockelman
Assignee: Brian Bockelman
 Fix For: 0.22.0

 Attachments: hadoop-4675-2.patch, hadoop-4675-3.patch, 
 HADOOP-4675-4.patch, HADOOP-4675-v5.patch, HADOOP-4675-v6.patch, 
 HADOOP-4675-v7.patch, HADOOP-4675-v8.patch, HADOOP-4675-v9.patch, 
 HADOOP-4675.patch, hadoop-4675.patch


 Ganglia changed its wire protocol in the 3.1.x series; the current 
 implementation only works for 3.0.x.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6939) Inconsistent lock ordering in AbstractDelegationTokenSecretManager

2010-11-09 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930476#action_12930476
 ] 

Tom White commented on HADOOP-6939:
---

+1

 Inconsistent lock ordering in AbstractDelegationTokenSecretManager
 --

 Key: HADOOP-6939
 URL: https://issues.apache.org/jira/browse/HADOOP-6939
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-6939.txt, lockorder.png


 AbstractDelegationTokenSecretManager.startThreads() is synchronized, which 
 calls updateCurrentKey(), which calls logUpdateMasterKey. 
 logUpdateMasterKey's implementation for HDFS's manager calls 
 namesystem.logUpdateMasterKey() which is synchronized. Thus the lock order is 
 ADTSM -> FSN. In FSN.saveNamespace, though, it calls 
 DTSM.saveSecretManagerState(), so the lock order is FSN -> ADTSM.
 I don't think this deadlock occurs in practice since saveNamespace won't 
 occur until after the ADTSM has started its threads, but should be fixed 
 anyway.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6939) Inconsistent lock ordering in AbstractDelegationTokenSecretManager

2010-11-09 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6939:
--

Status: Open  (was: Patch Available)

 Inconsistent lock ordering in AbstractDelegationTokenSecretManager
 --

 Key: HADOOP-6939
 URL: https://issues.apache.org/jira/browse/HADOOP-6939
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-6939.txt, lockorder.png


 AbstractDelegationTokenSecretManager.startThreads() is synchronized, which 
 calls updateCurrentKey(), which calls logUpdateMasterKey. 
 logUpdateMasterKey's implementation for HDFS's manager calls 
 namesystem.logUpdateMasterKey() which is synchronized. Thus the lock order is 
 ADTSM -> FSN. In FSN.saveNamespace, though, it calls 
 DTSM.saveSecretManagerState(), so the lock order is FSN -> ADTSM.
 I don't think this deadlock occurs in practice since saveNamespace won't 
 occur until after the ADTSM has started its threads, but should be fixed 
 anyway.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6939) Inconsistent lock ordering in AbstractDelegationTokenSecretManager

2010-11-09 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6939:
--

Hadoop Flags: [Reviewed]
  Status: Patch Available  (was: Open)

Re-running through hudson.

 Inconsistent lock ordering in AbstractDelegationTokenSecretManager
 --

 Key: HADOOP-6939
 URL: https://issues.apache.org/jira/browse/HADOOP-6939
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-6939.txt, lockorder.png


 AbstractDelegationTokenSecretManager.startThreads() is synchronized, which 
 calls updateCurrentKey(), which calls logUpdateMasterKey. 
 logUpdateMasterKey's implementation for HDFS's manager calls 
 namesystem.logUpdateMasterKey() which is synchronized. Thus the lock order is 
 ADTSM -> FSN. In FSN.saveNamespace, though, it calls 
 DTSM.saveSecretManagerState(), so the lock order is FSN -> ADTSM.
 I don't think this deadlock occurs in practice since saveNamespace won't 
 occur until after the ADTSM has started its threads, but should be fixed 
 anyway.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-4675) Current Ganglia metrics implementation is incompatible with Ganglia 3.1

2010-11-09 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-4675:
--

Attachment: HADOOP-4675.patch

I've regenerated the patch without the test, to be added in HDFS-1493. If there 
are no objections to the approach I'd like to commit this, since the fix is 
long overdue.

 Current Ganglia metrics implementation is incompatible with Ganglia 3.1
 ---

 Key: HADOOP-4675
 URL: https://issues.apache.org/jira/browse/HADOOP-4675
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.21.0
Reporter: Brian Bockelman
Assignee: Brian Bockelman
 Attachments: hadoop-4675-2.patch, hadoop-4675-3.patch, 
 HADOOP-4675-4.patch, HADOOP-4675-v5.patch, HADOOP-4675-v6.patch, 
 HADOOP-4675-v7.patch, HADOOP-4675-v8.patch, HADOOP-4675-v9.patch, 
 HADOOP-4675.patch, hadoop-4675.patch


 Ganglia changed its wire protocol in the 3.1.x series; the current 
 implementation only works for 3.0.x.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-4675) Current Ganglia metrics implementation is incompatible with Ganglia 3.1

2010-11-09 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-4675:
--

Status: Patch Available  (was: Open)

 Current Ganglia metrics implementation is incompatible with Ganglia 3.1
 ---

 Key: HADOOP-4675
 URL: https://issues.apache.org/jira/browse/HADOOP-4675
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.21.0
Reporter: Brian Bockelman
Assignee: Brian Bockelman
 Attachments: hadoop-4675-2.patch, hadoop-4675-3.patch, 
 HADOOP-4675-4.patch, HADOOP-4675-v5.patch, HADOOP-4675-v6.patch, 
 HADOOP-4675-v7.patch, HADOOP-4675-v8.patch, HADOOP-4675-v9.patch, 
 HADOOP-4675.patch, hadoop-4675.patch


 Ganglia changed its wire protocol in the 3.1.x series; the current 
 implementation only works for 3.0.x.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6754) DefaultCodec.createOutputStream() leaks memory

2010-11-08 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6754:
--

Attachment: HADOOP-6754.patch

+1 This would be good to add. I've regenerated the patch.

 DefaultCodec.createOutputStream() leaks memory
 --

 Key: HADOOP-6754
 URL: https://issues.apache.org/jira/browse/HADOOP-6754
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: CompressionBug.java, HADOOP-6754.2.patch, 
 HADOOP-6754.patch, HADOOP-6754.patch


 DefaultCodec.createOutputStream() creates a new Compressor instance in each 
 OutputStream. Even if the OutputStream is closed, this leaks memory.
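
(A minimal sketch of the pooled-compressor pattern that avoids allocating a new 
Compressor per stream; illustrative only, not necessarily the approach taken by 
the attached patch.)

{code}
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.Compressor;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class PooledCompressorSketch {
  public static void write(OutputStream rawOut, byte[] data, Configuration conf)
      throws Exception {
    DefaultCodec codec = ReflectionUtils.newInstance(DefaultCodec.class, conf);
    // Borrow a Compressor from the pool rather than creating one per stream.
    Compressor compressor = CodecPool.getCompressor(codec);
    try {
      CompressionOutputStream out = codec.createOutputStream(rawOut, compressor);
      out.write(data);
      out.finish();
      out.close();
    } finally {
      CodecPool.returnCompressor(compressor); // hand it back for reuse
    }
  }
}
{code}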

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6754) DefaultCodec.createOutputStream() leaks memory

2010-11-08 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6754:
--

Status: Open  (was: Patch Available)

 DefaultCodec.createOutputStream() leaks memory
 --

 Key: HADOOP-6754
 URL: https://issues.apache.org/jira/browse/HADOOP-6754
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: CompressionBug.java, HADOOP-6754.2.patch, 
 HADOOP-6754.patch, HADOOP-6754.patch


 DefaultCodec.createOutputStream() creates a new Compressor instance in each 
 OutputStream. Even if the OutputStream is closed, this leaks memory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6754) DefaultCodec.createOutputStream() leaks memory

2010-11-08 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6754:
--

Hadoop Flags: [Reviewed]
  Status: Patch Available  (was: Open)

 DefaultCodec.createOutputStream() leaks memory
 --

 Key: HADOOP-6754
 URL: https://issues.apache.org/jira/browse/HADOOP-6754
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: CompressionBug.java, HADOOP-6754.2.patch, 
 HADOOP-6754.patch, HADOOP-6754.patch


 DefaultCodec.createOutputStream() creates a new Compressor instance in each 
 OutputStream. Even if the OutputStream is closed, this leaks memory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6974) Configurable header buffer size for Hadoop HTTP server

2010-11-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6974:
--

Assignee: Paul Butler
  Status: Open  (was: Patch Available)

Marking as open while feedback is addressed. Also this patch needs to be 
against trunk so Hudson can test it - it looks like it's for 0.20.

 Configurable header buffer size for Hadoop HTTP server
 --

 Key: HADOOP-6974
 URL: https://issues.apache.org/jira/browse/HADOOP-6974
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Paul Butler
Assignee: Paul Butler
 Attachments: hadoop-6974.2.patch, hadoop-6974.patch


 This patch adds a configurable parameter dfs.http.header.buffer.size to 
 Hadoop which allows the buffer size to be configured from the xml 
 configuration.
 This fixes an issue that came up in an environment where the Hadoop servers 
 share a domain with other web applications that use domain cookies. The large 
 cookies overwhelmed Jetty's buffer which caused it to return a 413 error.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6758) MapFile.fix does not allow index interval definition

2010-11-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6758:
--

   Resolution: Fixed
Fix Version/s: 0.22.0
 Assignee: Gianmarco De Francisci Morales
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Gianmarco!

(In the version I committed I inlined the assignment and referenced 
Writer.INDEX_INTERVAL rather than the string constant.) 
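
(Illustrative usage only, assuming the configuration key matches 
Writer.INDEX_INTERVAL; the path and key/value types are placeholders.)

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class MapFileFixSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setInt("io.map.index.interval", 32); // overrides the old hard-coded 128
    FileSystem fs = FileSystem.get(conf);
    MapFile.fix(fs, new Path(args[0]), Text.class, IntWritable.class, false, conf);
  }
}
{code}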

 MapFile.fix does not allow index interval definition
 

 Key: HADOOP-6758
 URL: https://issues.apache.org/jira/browse/HADOOP-6758
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.20.1, 0.20.2
Reporter: Gianmarco De Francisci Morales
Assignee: Gianmarco De Francisci Morales
 Fix For: 0.22.0

 Attachments: HADOOP-6758.patch, HADOOP-6758.patch, HADOOP-6758.patch


 When using the static method MapFile.fix() there is no way to override the 
 default IndexInterval, which is 128.
 The IndexInterval should be taken from the configuration that is passed to 
 the method.
 {code}
 int indexInterval = 128; 
 indexInterval = conf.getInt(INDEX_INTERVAL, indexInterval); 
 {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6304) Use java.io.File.set{Readable|Writable|Executable} where possible in RawLocalFileSystem

2010-11-05 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12928849#action_12928849
 ] 

Tom White commented on HADOOP-6304:
---

Arun - it would be good to get this into trunk for the 0.22 release.

 Use java.io.File.set{Readable|Writable|Executable} where possible in 
 RawLocalFileSystem 
 

 Key: HADOOP-6304
 URL: https://issues.apache.org/jira/browse/HADOOP-6304
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 0.20.1
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Attachments: HADOOP-6304_yhadoop20.patch


 Using java.io.File.set{Readable|Writable|Executable} where possible in 
 RawLocalFileSystem when g & o perms are the same saves a lot of 'fork' 
 system-calls.
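
(A sketch of the idea for a mode such as 644, where the group and other bits 
agree; not the actual RawLocalFileSystem code.)

{code}
import java.io.File;

public class LocalPermissionSketch {
  // In-process calls replace a fork+exec of chmod when group == other.
  // ownerOnly=false applies the bit to owner, group and others together.
  static void applyMode644(File f) {
    f.setReadable(true, false);    // r for owner, group and others
    f.setWritable(false, false);   // clear w everywhere...
    f.setWritable(true, true);     // ...then set w for the owner only
    f.setExecutable(false, false); // no execute bit for anybody
  }
}
{code}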

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6776) UserGroupInformation.createProxyUser's javadoc is broken

2010-11-05 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12928857#action_12928857
 ] 

Tom White commented on HADOOP-6776:
---

This is fixed in trunk, so can be closed.

 UserGroupInformation.createProxyUser's javadoc is broken
 

 Key: HADOOP-6776
 URL: https://issues.apache.org/jira/browse/HADOOP-6776
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.22.0

 Attachments: 6776.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HADOOP-6776) UserGroupInformation.createProxyUser's javadoc is broken

2010-11-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White resolved HADOOP-6776.
---

Resolution: Duplicate

 UserGroupInformation.createProxyUser's javadoc is broken
 

 Key: HADOOP-6776
 URL: https://issues.apache.org/jira/browse/HADOOP-6776
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.22.0

 Attachments: 6776.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6926) SocketInputStream incorrectly implements read()

2010-11-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6926:
--

Attachment: HADOOP-6926.patch

+1

The javac compilation error was due to:

{noformat}
 [javac] 
/Users/tom/workspace/hadoop-common-trunk/src/test/core/org/apache/hadoop/net/TestSocketIOWithTimeout.java:112:
 warning: [cast] redundant cast to int
 [javac]   assertEquals((int)(byteWithHighBit & 0xff), in.read());
{noformat}

Here's a new patch with the cast removed.

 SocketInputStream incorrectly implements read()
 ---

 Key: HADOOP-6926
 URL: https://issues.apache.org/jira/browse/HADOOP-6926
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2, 0.21.0, 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.22.0

 Attachments: HADOOP-6926.patch, hadoop-6926.txt


 SocketInputStream's read() implementation doesn't upcast to int correctly, so 
 it can't read bytes > 0x80. This is the same bug as HADOOP-6925, but in a 
 different spot.
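
(A sketch of the general shape of the fix: mask the byte before returning it. 
This is illustrative, not the attached patch.)

{code}
import java.io.IOException;
import java.io.InputStream;

public abstract class MaskingReadSketch extends InputStream {
  // Subclasses supply the bulk read, as SocketInputStream does.
  @Override
  public abstract int read(byte[] b, int off, int len) throws IOException;

  // Mask to 0..255 so bytes over 0x7f are not sign-extended into negative ints,
  // which callers would otherwise mistake for EOF.
  @Override
  public int read() throws IOException {
    byte[] one = new byte[1];
    int n = read(one, 0, 1);
    return (n == -1) ? -1 : (one[0] & 0xff);
  }
}
{code}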

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6926) SocketInputStream incorrectly implements read()

2010-11-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6926:
--

Status: Open  (was: Patch Available)

 SocketInputStream incorrectly implements read()
 ---

 Key: HADOOP-6926
 URL: https://issues.apache.org/jira/browse/HADOOP-6926
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.21.0, 0.20.2, 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: HADOOP-6926.patch, hadoop-6926.txt


 SocketInputStream's read() implementation doesn't upcast to int correctly, so 
 it can't read bytes > 0x80. This is the same bug as HADOOP-6925, but in a 
 different spot.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6926) SocketInputStream incorrectly implements read()

2010-11-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6926:
--

Fix Version/s: 0.22.0
   Status: Patch Available  (was: Open)

 SocketInputStream incorrectly implements read()
 ---

 Key: HADOOP-6926
 URL: https://issues.apache.org/jira/browse/HADOOP-6926
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.21.0, 0.20.2, 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.22.0

 Attachments: HADOOP-6926.patch, hadoop-6926.txt


 SocketInputStream's read() implementation doesn't upcast to int correctly, so 
 it can't read bytes > 0x80. This is the same bug as HADOOP-6925, but in a 
 different spot.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6926) SocketInputStream incorrectly implements read()

2010-11-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6926:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Todd!

 SocketInputStream incorrectly implements read()
 ---

 Key: HADOOP-6926
 URL: https://issues.apache.org/jira/browse/HADOOP-6926
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2, 0.21.0, 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.22.0

 Attachments: HADOOP-6926.patch, hadoop-6926.txt


 SocketInputStream's read() implementation doesn't upcast to int correctly, so 
 it can't read bytes > 0x80. This is the same bug as HADOOP-6925, but in a 
 different spot.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HADOOP-6917) Remove empty FTPFileSystemConfigKeys.java file

2010-11-03 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White resolved HADOOP-6917.
---

Resolution: Duplicate

Fixed in HADOOP-6818

 Remove empty FTPFileSystemConfigKeys.java file
 --

 Key: HADOOP-6917
 URL: https://issues.apache.org/jira/browse/HADOOP-6917
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Tom White
Assignee: Tom White

 FTPFileSystemConfigKeys.java is empty and not used so should be deleted.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6974) Configurable header buffer size for Hadoop HTTP server

2010-11-02 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12927622#action_12927622
 ] 

Tom White commented on HADOOP-6974:
---

This looks good. Do we need a test that checks that the request fails for a 
cookie that is bigger than the header buffer?  

 Configurable header buffer size for Hadoop HTTP server
 --

 Key: HADOOP-6974
 URL: https://issues.apache.org/jira/browse/HADOOP-6974
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Paul Butler
 Attachments: hadoop-6974.2.patch, hadoop-6974.patch


 This patch adds a configurable parameter dfs.http.header.buffer.size to 
 Hadoop which allows the buffer size to be configured from the xml 
 configuration.
 This fixes an issue that came up in an environment where the Hadoop servers 
 share a domain with other web applications that use domain cookies. The large 
 cookies overwhelmed Jetty's buffer which caused it to return a 413 error.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6975) integer overflow in S3InputStream for blocks > 2GB

2010-11-01 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12927182#action_12927182
 ] 

Tom White commented on HADOOP-6975:
---

I ran the unit tests and test-patch.

{noformat}
 [exec] -1 overall. 
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec]
 [exec] +1 system tests framework.  The patch passed system tests 
framework compile.
 [exec]
 [exec] 
{noformat}

 integer overflow in S3InputStream for blocks > 2GB
 --

 Key: HADOOP-6975
 URL: https://issues.apache.org/jira/browse/HADOOP-6975
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Patrick Kling
 Attachments: HADOOP-6975.patch


 S3InputStream has the same integer overflow issue as DFSInputStream (fixed in 
 HDFS-96).
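
(Not the S3InputStream code, just a self-contained illustration of how int 
offset arithmetic goes wrong once a block grows past 2GB.)

{code}
public class OffsetOverflowSketch {
  public static void main(String[] args) {
    long pos = 3L * 1024 * 1024 * 1024; // 3 GB into a block that is larger than 2 GB
    int truncated = (int) pos;          // overflows: -1073741824
    long safe = pos;                    // keeping the arithmetic in long stays correct
    System.out.println(truncated + " vs " + safe);
  }
}
{code}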

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6975) integer overflow in S3InputStream for blocks > 2GB

2010-11-01 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6975:
--

   Resolution: Fixed
Fix Version/s: 0.22.0
 Assignee: Patrick Kling
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Patrick!

 integer overflow in S3InputStream for blocks > 2GB
 --

 Key: HADOOP-6975
 URL: https://issues.apache.org/jira/browse/HADOOP-6975
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Patrick Kling
Assignee: Patrick Kling
 Fix For: 0.22.0

 Attachments: HADOOP-6975.patch


 S3InputStream has the same integer overflow issue as DFSInputStream (fixed in 
 HDFS-96).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6663) BlockDecompressorStream get EOF exception when decompressing the file compressed from empty file

2010-10-28 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925875#action_12925875
 ] 

Tom White commented on HADOOP-6663:
---

I ran the tests and test-patch manually:

{noformat}
 [exec] +1 overall. 
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 4 new or 
modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec]
 [exec] +1 system tests framework.  The patch passed system tests 
framework compile.
{noformat}

 BlockDecompressorStream get EOF exception when decompressing the file 
 compressed from empty file
 

 Key: HADOOP-6663
 URL: https://issues.apache.org/jira/browse/HADOOP-6663
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
Reporter: Kang Xiao
Assignee: Kang Xiao
 Fix For: 0.22.0

 Attachments: BlockDecompressorStream.java.patch, 
 BlockDecompressorStream.java.patch, BlockDecompressorStream.patch, 
 HADOOP-6663.patch


 An empty file can be compressed using BlockDecompressorStream, which is for 
 block-based compression algorithms such as LZO. However, when decompressing 
 the compressed file, BlockDecompressorStream gets an EOF exception.
 Here is a typical exception stack:
 java.io.EOFException
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:125)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:96)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:82)
 at 
 org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
 at java.io.InputStream.read(InputStream.java:85)
 at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
 at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:186)
 at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:170)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
 at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:18)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
 at org.apache.hadoop.mapred.Child.main(Child.java:196)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6663) BlockDecompressorStream get EOF exception when decompressing the file compressed from empty file

2010-10-28 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6663:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I've just committed this.

 BlockDecompressorStream get EOF exception when decompressing the file 
 compressed from empty file
 

 Key: HADOOP-6663
 URL: https://issues.apache.org/jira/browse/HADOOP-6663
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
Reporter: Kang Xiao
Assignee: Kang Xiao
 Fix For: 0.22.0

 Attachments: BlockDecompressorStream.java.patch, 
 BlockDecompressorStream.java.patch, BlockDecompressorStream.patch, 
 HADOOP-6663.patch


 An empty file can be compressed using BlockDecompressorStream, which is for 
 block-based compression algorithms such as LZO. However, when decompressing 
 the compressed file, BlockDecompressorStream gets an EOF exception.
 Here is a typical exception stack:
 java.io.EOFException
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:125)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:96)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:82)
 at 
 org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
 at java.io.InputStream.read(InputStream.java:85)
 at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
 at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:186)
 at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:170)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
 at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:18)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
 at org.apache.hadoop.mapred.Child.main(Child.java:196)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6947) Kerberos relogin should set refreshKrb5Config to true

2010-10-26 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6947:
--

   Resolution: Fixed
Fix Version/s: 0.22.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Todd!

 Kerberos relogin should set refreshKrb5Config to true
 -

 Key: HADOOP-6947
 URL: https://issues.apache.org/jira/browse/HADOOP-6947
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.22.0

 Attachments: hadoop-6947-branch20.txt, hadoop-6947.txt


 In working on securing a daemon that uses two different principals from 
 different threads, I found that I wasn't able to login from a second keytab 
 after I'd logged in from the first. This is because we don't set the 
 refreshKrb5Config in the Configuration for the Krb5LoginModule - hence it 
 won't switch over to the correct keytab file if it's different than the first.
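
(A sketch of the JAAS options map in question; the keytab path and principal 
are placeholders, and refreshKrb5Config is the only option this issue is about.)

{code}
import java.util.HashMap;
import java.util.Map;

public class KeytabLoginOptionsSketch {
  // Options passed to com.sun.security.auth.module.Krb5LoginModule.
  static Map<String, String> keytabOptions(String keytab, String principal) {
    Map<String, String> options = new HashMap<String, String>();
    options.put("useKeyTab", "true");
    options.put("storeKey", "true");
    options.put("keyTab", keytab);
    options.put("principal", principal);
    // Re-read the Kerberos configuration on each login so a second login from a
    // different keytab isn't stuck with the first one's settings.
    options.put("refreshKrb5Config", "true");
    return options;
  }
}
{code}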

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6954) Sources JARs are not correctly published to the Maven repository

2010-10-26 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925042#action_12925042
 ] 

Tom White commented on HADOOP-6954:
---

Here are the results of running test-patch:

{noformat}
 [exec] -1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 system tests framework.  The patch passed system tests 
framework compile.
{noformat}

 Sources JARs are not correctly published to the Maven repository
 

 Key: HADOOP-6954
 URL: https://issues.apache.org/jira/browse/HADOOP-6954
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.21.0
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.21.1

 Attachments: HADOOP-6954.patch


 When you try to close the staging repository to make it visible to the 
 public the operation fails (see 
 https://issues.apache.org/jira/browse/HDFS-1292?focusedCommentId=12909953page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12909953
  for the type of error).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6954) Sources JARs are not correctly published to the Maven repository

2010-10-26 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6954:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I've just committed this.

 Sources JARs are not correctly published to the Maven repository
 

 Key: HADOOP-6954
 URL: https://issues.apache.org/jira/browse/HADOOP-6954
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.21.0
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.21.1

 Attachments: HADOOP-6954.patch


 When you try to close the staging repository to make it visible to the 
 public the operation fails (see 
 https://issues.apache.org/jira/browse/HDFS-1292?focusedCommentId=12909953page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12909953
  for the type of error).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6757) NullPointerException for hadoop clients launched from streaming tasks

2010-10-26 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6757:
--

Status: Open  (was: Patch Available)

Marking as open while Sharad's comment is addressed.

 NullPointerException for hadoop clients launched from streaming tasks
 -

 Key: HADOOP-6757
 URL: https://issues.apache.org/jira/browse/HADOOP-6757
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Reporter: Amar Kamat
Assignee: Amar Kamat
 Attachments: BZ-3620565-v1.0.patch, HADOOP-6757-v1.0.patch


 TaskRunner sets HADOOP_ROOT_LOGGER to info,TLA while launching the child 
 tasks. TLA implicitly assumes that the task-id information will be made 
 available via the 'hadoop.tasklog.taskid' parameter. 'hadoop.tasklog.taskid' 
 is passed to the child task by the TaskRunner via HADOOP_CLIENT_OPTS. When 
 the streaming task launches a hadoop client (say hadoop job -list), the 
 HADOOP_ROOT_LOGGER of the hadoop client is set to 'info,TLA' but 
 hadoop.tasklog.taskid is not set, resulting in an NPE.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6916) Implement append operation for S3FileSystem

2010-10-25 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6916:
--

Status: Open  (was: Patch Available)

Marking as open while feedback is addressed.

 Implement append operation for S3FileSystem
 ---

 Key: HADOOP-6916
 URL: https://issues.apache.org/jira/browse/HADOOP-6916
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/s3
Affects Versions: 0.20.2, 0.22.0
Reporter: Oleg Aleshko
Priority: Minor
 Fix For: 0.22.0

 Attachments: s3_append1.patch


 Currently org.apache.hadoop.fs.s3.S3FileSystem#append throws an 
 IOException("Not supported");
 S3FileSystem should be able to support appending, possibly via common block 
 storage logic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6663) BlockDecompressorStream get EOF exception when decompressing the file compressed from empty file

2010-10-25 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6663:
--

   Resolution: Fixed
Fix Version/s: 0.22.0
 Assignee: Kang Xiao
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Kang Xiao!

 BlockDecompressorStream get EOF exception when decompressing the file 
 compressed from empty file
 

 Key: HADOOP-6663
 URL: https://issues.apache.org/jira/browse/HADOOP-6663
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
Reporter: Kang Xiao
Assignee: Kang Xiao
 Fix For: 0.22.0

 Attachments: BlockDecompressorStream.java.patch, 
 BlockDecompressorStream.java.patch, BlockDecompressorStream.patch


 An empty file can be compressed with a block-based compression codec such as 
 LZO. However, when decompressing the resulting file, BlockDecompressorStream 
 gets an EOF exception.
 Here is a typical exception stack:
 java.io.EOFException
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:125)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:96)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:82)
 at 
 org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
 at java.io.InputStream.read(InputStream.java:85)
 at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
 at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:186)
 at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:170)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
 at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:18)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
 at org.apache.hadoop.mapred.Child.main(Child.java:196)
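
The EOFException comes from reading the next block-length header off an 
already-exhausted stream. A sketch of the kind of guard involved (not the 
committed patch; it assumes a 4-byte big-endian length prefix): report 
end-of-file when the stream is cleanly exhausted at a block boundary, and 
only treat a truncated header as an error.

{code}
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class BlockLengthReader {
  /** Returns the next block length, or -1 if the stream ended cleanly. */
  static int readBlockLength(InputStream in) throws IOException {
    int b1 = in.read();
    if (b1 == -1) {
      return -1;                // empty (or fully consumed) compressed stream
    }
    int b2 = in.read(), b3 = in.read(), b4 = in.read();
    if ((b2 | b3 | b4) < 0) {
      throw new EOFException(); // a truncated length header is still an error
    }
    return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
  }
}
{code}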

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6663) BlockDecompressorStream get EOF exception when decompressing the file compressed from empty file

2010-10-25 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6663:
--

Attachment: HADOOP-6663.patch

 BlockDecompressorStream get EOF exception when decompressing the file 
 compressed from empty file
 

 Key: HADOOP-6663
 URL: https://issues.apache.org/jira/browse/HADOOP-6663
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
Reporter: Kang Xiao
Assignee: Kang Xiao
 Fix For: 0.22.0

 Attachments: BlockDecompressorStream.java.patch, 
 BlockDecompressorStream.java.patch, BlockDecompressorStream.patch, 
 HADOOP-6663.patch


 An empty file can be compressed with a block-based compression codec such as 
 LZO. However, when decompressing the resulting file, BlockDecompressorStream 
 gets an EOF exception.
 Here is a typical exception stack:
 java.io.EOFException
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:125)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:96)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:82)
 at 
 org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
 at java.io.InputStream.read(InputStream.java:85)
 at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
 at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:186)
 at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:170)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
 at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:18)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
 at org.apache.hadoop.mapred.Child.main(Child.java:196)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Reopened: (HADOOP-6663) BlockDecompressorStream get EOF exception when decompressing the file compressed from empty file

2010-10-25 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White reopened HADOOP-6663:
---


Re-opening as there was a compilation problem.

 BlockDecompressorStream get EOF exception when decompressing the file 
 compressed from empty file
 

 Key: HADOOP-6663
 URL: https://issues.apache.org/jira/browse/HADOOP-6663
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
Reporter: Kang Xiao
Assignee: Kang Xiao
 Fix For: 0.22.0

 Attachments: BlockDecompressorStream.java.patch, 
 BlockDecompressorStream.java.patch, BlockDecompressorStream.patch, 
 HADOOP-6663.patch


 An empty file can be compressed with a block-based compression codec such as 
 LZO. However, when decompressing the resulting file, BlockDecompressorStream 
 gets an EOF exception.
 Here is a typical exception stack:
 java.io.EOFException
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:125)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:96)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:82)
 at 
 org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
 at java.io.InputStream.read(InputStream.java:85)
 at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
 at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:186)
 at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:170)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
 at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:18)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
 at org.apache.hadoop.mapred.Child.main(Child.java:196)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6663) BlockDecompressorStream get EOF exception when decompressing the file compressed from empty file

2010-10-25 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6663:
--

Status: Patch Available  (was: Reopened)

Running updated patch through Hudson.

 BlockDecompressorStream get EOF exception when decompressing the file 
 compressed from empty file
 

 Key: HADOOP-6663
 URL: https://issues.apache.org/jira/browse/HADOOP-6663
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2
Reporter: Kang Xiao
Assignee: Kang Xiao
 Fix For: 0.22.0

 Attachments: BlockDecompressorStream.java.patch, 
 BlockDecompressorStream.java.patch, BlockDecompressorStream.patch, 
 HADOOP-6663.patch


 An empty file can be compressed with a block-based compression codec such as 
 LZO. However, when decompressing the resulting file, BlockDecompressorStream 
 gets an EOF exception.
 Here is a typical exception stack:
 java.io.EOFException
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:125)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:96)
 at 
 org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:82)
 at 
 org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
 at java.io.InputStream.read(InputStream.java:85)
 at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
 at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
 at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:186)
 at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:170)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
 at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:18)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
 at org.apache.hadoop.mapred.Child.main(Child.java:196)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (HADOOP-6917) Remove empty FTPFileSystemConfigKeys.java file

2010-10-18 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White reassigned HADOOP-6917:
-

Assignee: Tom White

 Remove empty FTPFileSystemConfigKeys.java file
 --

 Key: HADOOP-6917
 URL: https://issues.apache.org/jira/browse/HADOOP-6917
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Tom White
Assignee: Tom White

 FTPFileSystemConfigKeys.java is empty and not used so should be deleted.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6993) Broken link on cluster setup page of docs

2010-10-07 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918987#action_12918987
 ] 

Tom White commented on HADOOP-6993:
---

+1

 Broken link on cluster setup page of docs
 -

 Key: HADOOP-6993
 URL: https://issues.apache.org/jira/browse/HADOOP-6993
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.21.0
Reporter: Aaron T. Myers
Assignee: Eli Collins
 Fix For: 0.21.1, 0.22.0

 Attachments: hadoop-6993-21-1.patch, hadoop-6993-22-1.patch


 The link on 
 http://hadoop.apache.org/common/docs/current/cluster_setup.html#Configuring+the+Hadoop+Daemons
  to core-default.xml is presently:
 {quote}
 http://hadoop.apache.org/common/docs/current/common-default.html
 {quote}
 but it should be:
 {quote}
 http://hadoop.apache.org/common/docs/current/core-default.html
 {quote}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6933) TestListFiles is flaky

2010-10-07 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6933:
--

   Resolution: Fixed
Fix Version/s: 0.22.0
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks, Todd!

 TestListFiles is flaky
 --

 Key: HADOOP-6933
 URL: https://issues.apache.org/jira/browse/HADOOP-6933
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.22.0

 Attachments: hadoop-6933.txt


 TestListFiles assumes a particular order of the files returned by the 
 directory iterator. There's no such guarantee made by the underlying API, so 
 the test fails on some hosts.
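
A sketch of the order-insensitive style of check the fix calls for (this is 
not the attached patch, and the file names are made up): compare the listing 
as a set instead of relying on iteration order.

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class OrderInsensitiveListingCheck {
  public static void main(String[] args) {
    String[] listed   = {"part-00001", "part-00000"}; // order not guaranteed
    String[] expected = {"part-00000", "part-00001"};

    Set<String> listedSet   = new HashSet<String>(Arrays.asList(listed));
    Set<String> expectedSet = new HashSet<String>(Arrays.asList(expected));

    // Asserting set equality tolerates any ordering from the directory
    // iterator, which is all the underlying API guarantees.
    if (!listedSet.equals(expectedSet)) {
      throw new AssertionError("listing mismatch: " + listedSet);
    }
    System.out.println("listing matches regardless of order");
  }
}
{code}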

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HADOOP-6969) CHANGES.txt does not reflect the release of version 0.21.0.

2010-10-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White resolved HADOOP-6969.
---

Resolution: Fixed

I've fixed this.

 CHANGES.txt does not reflect the release of version 0.21.0.
 ---

 Key: HADOOP-6969
 URL: https://issues.apache.org/jira/browse/HADOOP-6969
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0
 Environment: CHANGES.txt should show the release date for 0.21.0 and 
 include a section for 0.21.1 - Unreleased. Latest changes that did not 
 make it into 0.21.0 should be moved under the 0.21.1 section.
Reporter: Konstantin Shvachko
Assignee: Tom White
 Fix For: 0.21.1




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6895) Native Libraries do not load if a different platform signature is returned from org.apache.hadoop.util.PlatformName

2010-10-05 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918412#action_12918412
 ] 

Tom White commented on HADOOP-6895:
---

+1, looks good to me. It's worth adding a comment in the file explaining why 
this is needed. Same for HADOOP-6923.

 Native Libraries do not load if a different platform signature is returned 
 from org.apache.hadoop.util.PlatformName
 ---

 Key: HADOOP-6895
 URL: https://issues.apache.org/jira/browse/HADOOP-6895
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
 Environment: SLES 10, IBM Java 6, Hadoop 0.21.0-rc0
Reporter: Stephen Watt
Priority: Minor
 Fix For: 0.21.1, 0.22.0

 Attachments: HADOOP-6895.patch

   Original Estimate: 0h
  Remaining Estimate: 0h

 bin/hadoop-config.sh has an environment variable called JAVA_PLATFORM which 
 is set to the result returned by org.apache.hadoop.util.PlatformName. 
 This result is sometimes unique to the JRE being used. Although the value 
 returned by 64-bit Sun/Oracle Java and 64-bit IBM Java is the same, it 
 differs for the corresponding 32-bit JREs. 
 The issue is that the returned value is used to build the path to the 
 native libraries on disk, i.e. 
 ${HADOOP_COMMON_HOME}/lib/native/${JAVA_PLATFORM}
 Since the path on disk is fixed at the Sun JRE value 
 /lib/native/Linux-i386-32, loading the native libraries fails when 32-bit 
 IBM Java returns the different value /lib/native/Linux-x86-32.
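
To see why the value differs between JREs, the platform string is typically 
derived from JVM system properties roughly as below (an approximation, not 
the exact PlatformName source):

{code}
public class PlatformNameSketch {
  public static void main(String[] args) {
    // os.arch is JRE-specific: 32-bit Sun JREs report "i386" on Linux while
    // 32-bit IBM JREs report "x86", so the derived directory names diverge.
    String platform = System.getProperty("os.name") + "-"
        + System.getProperty("os.arch") + "-"
        + System.getProperty("sun.arch.data.model");
    System.out.println(platform.replace(' ', '_'));
    // e.g. Linux-i386-32 (32-bit Sun) vs. Linux-x86-32 (32-bit IBM)
  }
}
{code}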

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6941) Hadoop 0.21 will not work on non-SUN JREs due to use of com.sun.security in org/apache/hadoop/security/UserGroupInformation.java

2010-10-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6941:
--

Fix Version/s: (was: 0.21.0)

 Hadoop 0.21 will not work on non-SUN JREs due to use of com.sun.security in 
 org/apache/hadoop/security/UserGroupInformation.java
 

 Key: HADOOP-6941
 URL: https://issues.apache.org/jira/browse/HADOOP-6941
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0
 Environment: SLES 11, Apache Harmony 6 and SLES 11, IBM Java 6
Reporter: Stephen Watt
 Fix For: 0.21.1, 0.22.0


 Attempting to format the namenode or attempting to start Hadoop using Apache 
 Harmony or the IBM Java JREs results in the following exception:
 10/09/07 16:35:05 ERROR namenode.NameNode: java.lang.NoClassDefFoundError: 
 com.sun.security.auth.UnixPrincipal
   at 
 org.apache.hadoop.security.UserGroupInformation.clinit(UserGroupInformation.java:223)
   at java.lang.J9VMInternals.initializeImpl(Native Method)
   at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setConfigurationParameters(FSNamesystem.java:420)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:391)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1240)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1348)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)
 Caused by: java.lang.ClassNotFoundException: 
 com.sun.security.auth.UnixPrincipal
   at java.net.URLClassLoader.findClass(URLClassLoader.java:421)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:652)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:346)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:618)
   ... 8 more
 This is a regression, as previous versions of Hadoop worked with these JREs.
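
A reflective probe illustrates the portability problem. This sketch is not 
the fix that was applied to UserGroupInformation; it only shows how code can 
detect the Sun-only class at runtime instead of referencing it statically.

{code}
public class UnixPrincipalProbe {
  public static void main(String[] args) {
    String sunClass = "com.sun.security.auth.UnixPrincipal";
    try {
      Class.forName(sunClass);
      System.out.println(sunClass + " is available on this JRE");
    } catch (ClassNotFoundException e) {
      // Apache Harmony and IBM Java 6 land here; a portable implementation
      // would need to fall back to a JRE-neutral principal class.
      System.out.println(sunClass + " not found; a fallback is required");
    }
  }
}
{code}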

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6954) Sources JARs are not correctly published to the Maven repository

2010-10-05 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918415#action_12918415
 ] 

Tom White commented on HADOOP-6954:
---

I'd like to commit this unless there are any objections.

 Sources JARs are not correctly published to the Maven repository
 

 Key: HADOOP-6954
 URL: https://issues.apache.org/jira/browse/HADOOP-6954
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.21.0
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.21.1

 Attachments: HADOOP-6954.patch


 When you try to close the staging repository to make it visible to the 
 public, the operation fails (see 
 https://issues.apache.org/jira/browse/HDFS-1292?focusedCommentId=12909953&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12909953
  for the type of error).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-6980) Add abstraction layer to isolate cluster deployment mechanisms

2010-09-29 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916304#action_12916304
 ] 

Tom White commented on HADOOP-6980:
---

Whirr (http://incubator.apache.org/whirr/) is an example of a cluster 
deployment mechanism, so it might be a good test case for such an API.

 Add abstraction layer to isolate cluster deployment mechanisms
 --

 Key: HADOOP-6980
 URL: https://issues.apache.org/jira/browse/HADOOP-6980
 Project: Hadoop Common
  Issue Type: Improvement
  Components: test
Affects Versions: 0.22.0
Reporter: Konstantin Boudnik

 Certain types of system tests might require a fresh deployment of a test 
 cluster (e.g. upgrade tests and similar).
 This can be achieved by deploying the cluster externally and then running 
 the tests. However, that won't work if re-deployment is needed in the middle 
 of such a test's execution. In this case, Herriot needs to be able to 
 explicitly invoke a deployment mechanism to carry on the process.
 However, there are many possible ways of implementing cluster deployment, 
 and Herriot couldn't possibly be aware of all of them, nor should it be able 
 to satisfy all their different interfaces. Thus an abstract interface should 
 isolate pluggable concrete implementations.
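
A hypothetical sketch of the kind of abstraction being proposed; the 
interface and method names below are invented for illustration and are not 
part of Herriot or Whirr.

{code}
import java.util.Properties;

// Hypothetical plug point: each deployment mechanism (scripts, Whirr, a
// vendor installer, ...) would supply its own implementation, and Herriot
// would only program against this interface.
public interface ClusterDeployer {
  /** Deploy (or re-deploy) a test cluster from the given specification. */
  void deploy(Properties clusterSpec) throws Exception;

  /** Tear the cluster down so a fresh deployment can follow mid-test. */
  void teardown() throws Exception;
}
{code}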

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HADOOP-6951) Distinct minicluster services (e.g. NN and JT) overwrite each other's service policies

2010-09-29 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White resolved HADOOP-6951.
---

Resolution: Fixed

 Distinct minicluster services (e.g. NN and JT) overwrite each other's service 
 policies
 --

 Key: HADOOP-6951
 URL: https://issues.apache.org/jira/browse/HADOOP-6951
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Fix For: 0.22.0

 Attachments: hadoop-6951.1.txt, hadoop-6951.2.txt, hadoop-6951.txt.0


 Because the protocol-to-ACL mapping in ServiceAuthorizationManager is static, 
 services which are run in the same JVM have the potential to clobber each 
 other's service authorization ACLs whenever 
 ServiceAuthorizationManager.refresh() is called. This causes authorization 
 failures if one tries to launch a 2NN connected to a minicluster with 
 hadoop.security.authorization enabled. It seems each service should have 
 its own instance of a ServiceAuthorizationManager, instead of using static 
 methods.
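
A simplified sketch of the change in shape the description suggests (this is 
not the actual ServiceAuthorizationManager code): hold the protocol-to-ACL 
map on each service instance instead of in a static, JVM-wide field.

{code}
import java.util.HashMap;
import java.util.Map;

public class PerServiceAuthorization {
  // With a static map, every service sharing the JVM clobbers the others on
  // refresh(). Keeping the map per instance lets the NN, JT and 2NN in a
  // minicluster each hold their own policies.
  private final Map<Class<?>, String> protocolToAcl =
      new HashMap<Class<?>, String>();

  public void refresh(Map<Class<?>, String> newAcls) {
    protocolToAcl.clear();
    protocolToAcl.putAll(newAcls);
  }

  public String getAcl(Class<?> protocol) {
    return protocolToAcl.get(protocol);
  }
}
{code}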

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (HADOOP-6969) CHANGES.txt does not reflect the release of version 0.21.0.

2010-09-24 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White reassigned HADOOP-6969:
-

Assignee: Tom White

 CHANGES.txt does not reflect the release of version 0.21.0.
 ---

 Key: HADOOP-6969
 URL: https://issues.apache.org/jira/browse/HADOOP-6969
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0
 Environment: CHANGES.txt should show the release date for 0.21.0 and 
 include a section for 0.21.1 - Unreleased. Latest changes that did not 
 make it into 0.21.0 should be moved under the 0.21.1 section.
Reporter: Konstantin Shvachko
Assignee: Tom White
 Fix For: 0.21.1




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-6951) Distinct minicluster services (e.g. NN and JT) overwrite each other's service policies

2010-09-24 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-6951:
--

  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

I've just committed this. Thanks, Aaron!

(I checked that the unit tests passed before committing this.)

 Distinct minicluster services (e.g. NN and JT) overwrite each other's service 
 policies
 --

 Key: HADOOP-6951
 URL: https://issues.apache.org/jira/browse/HADOOP-6951
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Fix For: 0.22.0

 Attachments: hadoop-6951.1.txt, hadoop-6951.2.txt, hadoop-6951.txt.0


 Because the protocol-to-ACL mapping in ServiceAuthorizationManager is static, 
 services which are run in the same JVM have the potential to clobber each 
 other's service authorization ACLs whenever 
 ServiceAuthorizationManager.refresh() is called. This causes authorization 
 failures if one tries to launch a 2NN connected to a minicluster with 
 hadoop.security.authorization enabled. It seems each service should have 
 its own instance of a ServiceAuthorizationManager, instead of using static 
 methods.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


