[jira] [Resolved] (HADOOP-7551) LocalDirAllocator should incorporate LocalStorage

2011-11-03 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-7551.
-

Resolution: Won't Fix

Per MAPREDUCE-3011, if the conf given to the LocalDirAllocator is always 
updated by callers (eg with input based on LocalStorage), as is currently the 
case, it doesn't need to be aware of LocalStorage. It would be good if 
LocalDirAllocator didn't need to be given a new conf, but combining 
LocalDirAllocator and LocalStorage is probably too invasive for a stable 
release. Closing as won't fix. 
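
A minimal sketch of the caller-side pattern described above, assuming a 
hypothetical goodDirs array of healthy directories; LocalDirAllocator and 
getLocalPathForWrite are the real APIs, but the surrounding code is 
illustrative only:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
import org.apache.hadoop.fs.Path;

public class GoodDirsExample {
  public static void main(String[] args) throws Exception {
    // Hypothetical: the good-dirs list a caller (eg the TT, via
    // LocalStorage) would maintain and pass in on every call.
    String[] goodDirs = { "/data/1/mapred/local", "/data/2/mapred/local" };

    Configuration conf = new Configuration();
    conf.setStrings("mapred.local.dir", goodDirs);

    // The allocator only sees the conf it is handed, so as long as
    // callers keep that conf current it needs no LocalStorage awareness.
    LocalDirAllocator allocator = new LocalDirAllocator("mapred.local.dir");
    Path spill = allocator.getLocalPathForWrite("output/spill0.out", conf);
    System.out.println("allocated: " + spill);
  }
}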

 LocalDirAllocator should incorporate LocalStorage
 -

 Key: HADOOP-7551
 URL: https://issues.apache.org/jira/browse/HADOOP-7551
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 0.20.204.0
Reporter: Eli Collins

 The o.a.h.fs.LocalDirAllocator is not aware of o.a.h.m.t.LocalStorage 
 (introduced in MAPREDUCE-2413) - it always considers the configured local 
 dirs, not just the ones that happen to be good. Therefore if there's a disk 
 failure then *every* call to get a local path will result in 
 LocalDirAllocator#confChanged doing a disk check of *all* the configured 
 local dirs. It seems like LocalStorage should be a private class to 
 LocalDirAllocator so that all users of LocalDirAllocator benefit from the disk 
 failure handling and all the various users of LocalDirAllocator don't have to 
 be modified to handle disk failures. Note that LocalDirAllocator already 
 handles faulty directories.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: FileSystem contract of listStatus

2011-11-02 Thread Eli Collins
Hey Noah,

HDFS returns items in lexicographic order by byte (see
INode#compareBytes) but I don't think ordering was intended to be an
explicit part of the contract. Ie the test probably just needs to be
modified to ignore the order.

RawLocalFileSystem uses Java's File#list, which makes no guarantee
"that the name strings in the resulting array will appear in any
specific order; they are not, in particular, guaranteed to appear in
alphabetical order". However, FileSystemContractBaseTest isn't run
against local file systems, which is probably why this never came up.

Thanks,
Eli

On Wed, Nov 2, 2011 at 3:57 PM, Noah Watkins jayh...@soe.ucsc.edu wrote:
 I have a question about the FileSystem contract in 0.20.

 In FileSystemContractBaseTest:testFileStatus() there
 are several files created, and afterwards the test confirms
 that they are present. Here is the relevant code:

    FileStatus[] paths = fs.listStatus(path("/test"));

    paths = fs.listStatus(path("/test/hadoop"));
    assertEquals(3, paths.length);
    assertEquals(path("/test/hadoop/a"), paths[0].getPath());
    assertEquals(path("/test/hadoop/b"), paths[1].getPath());
    assertEquals(path("/test/hadoop/c"), paths[2].getPath());

 This test will fail if the results are not in that specific
 order. Is this ordering (alphanumeric?) part of the contract?
 Can FileSystem return results from listStatus() in any order?

 Thanks,
 Noah
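
A minimal sketch of the order-insensitive check suggested above, assuming the
fs field and path() helper available inside FileSystemContractBaseTest:

import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

// Collect the returned paths into a set so the assertion no longer
// depends on whatever order listStatus() happens to use.
FileStatus[] paths = fs.listStatus(path("/test/hadoop"));
assertEquals(3, paths.length);
Set<Path> actual = new HashSet<Path>();
for (FileStatus s : paths) {
  actual.add(s.getPath());
}
Set<Path> expected = new HashSet<Path>(Arrays.asList(
    path("/test/hadoop/a"), path("/test/hadoop/b"), path("/test/hadoop/c")));
assertEquals(expected, actual);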



[jira] [Created] (HADOOP-7796) HADOOP-7773 introduced 7 new findbugs warnings

2011-11-01 Thread Eli Collins (Created) (JIRA)
HADOOP-7773 introduced 7 new findbugs warnings
--

 Key: HADOOP-7796
 URL: https://issues.apache.org/jira/browse/HADOOP-7796
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Affects Versions: 0.23.0, 0.24.0
Reporter: Eli Collins



Code  Warning
Se  com.google.protobuf.ByteString stored into non-transient field 
HadoopRpcProtos$HadoopRpcExceptionProto.exceptionName_
Se  com.google.protobuf.ByteString stored into non-transient field 
HadoopRpcProtos$HadoopRpcExceptionProto.stackTrace_
Se  Class 
org.apache.hadoop.ipc.protobuf.HadoopRpcProtos$HadoopRpcRequestProto defines 
non-transient non-serializable instance field request_
Se  com.google.protobuf.ByteString stored into non-transient field 
HadoopRpcProtos$HadoopRpcRequestProto.methodName_
Se  Class 
org.apache.hadoop.ipc.protobuf.HadoopRpcProtos$HadoopRpcResponseProto defines 
non-transient non-serializable instance field response_

Dodgy Warnings

Code  Warning
UCF Useless control flow in 
org.apache.hadoop.ipc.protobuf.HadoopRpcProtos$HadoopRpcExceptionProto$Builder.maybeForceBuilderInitialization()
UCF Useless control flow in 
org.apache.hadoop.ipc.protobuf.HadoopRpcProtos$HadoopRpcRequestProto$Builder.maybeForceBuilderInitialization()


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7783) Tests for HDFS-2514

2011-10-29 Thread Eli Collins (Created) (JIRA)
Tests for HDFS-2514
---

 Key: HADOOP-7783
 URL: https://issues.apache.org/jira/browse/HADOOP-7783
 Project: Hadoop Common
  Issue Type: Test
  Components: fs
Affects Versions: 0.21.0, 0.22.0, 0.23.0
Reporter: Eli Collins
Assignee: Eli Collins


This covers the tests for HDFS-2514.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Avro compilation errors building from eclipse

2011-10-26 Thread Eli Collins
Hey gang,

I refreshed my trunk trees and ran mvn eclipse:eclipse and am seeing the
old Avro compilation errors in TestAvroSerialization and friends. I've
got target/generated-sources/java as a source dir, which I thought was
sufficient to get those classes on the classpath.   Did something in
the build change recently?

Thanks,
Eli


Re: Avro compilation errors building from eclipse

2011-10-26 Thread Eli Collins
Thanks, that fixed it.   Had to run mvn test -DskipTests to generate
generated-test-sources.

On Wed, Oct 26, 2011 at 4:26 PM, Arun C Murthy a...@hortonworks.com wrote:
 Just add generated-test-sources (manually) to your srcs.

 On Oct 26, 2011, at 4:18 PM, Eli Collins wrote:

 Hey gang,

 I refreshed my trunk trees and ran mvn eclipse:eclipse and am seeing the
 old Avro compilation errors in TestAvroSerialization and friends. I've
 got target/generated-sources/java as a source dir, which I thought was
 sufficient to get those classes on the classpath.   Did something in
 the build change recently?

 Thanks,
 Eli




Re: Development basis / rebuilding Cloudera dist

2011-10-23 Thread Eli Collins
Hey Tim,

+ cdh-user@ where someone can help with your specific issue. (bcc common-dev).

You may also want to check out Apache Bigtop:
http://incubator.apache.org/bigtop

Thanks,
Eli

On Fri, Oct 21, 2011 at 10:30 PM, Tim Broberg tbrob...@yahoo.com wrote:
 I'd like to add a core module to hadoop, but I'm running into some issues 
 getting started.

 What I want is to be able to add a native library and codec to some stable 
 build of hadoop, build, debug, experiment, and benchmark.

 Currently, I'm trying to rebuild the Cloudera rpms so I can get a complete 
 stable set of source to start from. (When I tried working from the SVN trunk, 
 it seemed there was so much active development going on, it was hard to get 
 something stable to compile.)

 So, I'm working from the Cloudera instructions - 
 https://ccp.cloudera.com/display/CDHDOC/Building+RPMs+from+CDH+Source+RPMs.

 I downloaded hadoop-0.20-0.20.2+923.97-1.src.rpm, installed jdk, ant, maven, 
 and set various environment variables:
     export PATH=$PATH:/usr/local/apache-maven-3.0.3/bin
     export JAVA_HOME=/usr/java/jdk1.7.0
     export HADOOP_HOME=/usr/lib/hadoop-0.20
     export HADOOP_VERSION=`hadoop version | head -n 1 | cut -f 2 -d ' '`

 I wasn't sure what to do about ANT_HOME, FORREST_HOME, or JAVA5_HOME as 
 forrest wasn't requested to be installed, ant doesn't appear to have a 
 special directory anywhere, and I'm just generally not sure what's up with 
 JAVA5_HOME, but this appears to be forrest related? None of these is 
 generating complaints when I build.

 When I build, I get a whole bunch of output including the following:

 rpmbuild --rebuild $SRPM

 .snip

 compile:
  [echo] contrib: gridmix
     [javac] Compiling 31 source files to 
 /home/tbroberg/rpmbuild/BUILD/hadoop-0.20.2-cdh3u1/build/contrib/gridmix/classes
     [javac] 
 /home/tbroberg/rpmbuild/BUILD/hadoop-0.20.2-cdh3u1/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/Gridmix.java:396:
  error: type argument ? extends T is not within bounds of type-variable E
     [javac]   private <T> String getEnumValues(Enum<? extends T>[] e) {
     [javac]                                    ^
     [javac]   where T,E are type-variables:
     [javac]     T extends Object declared in method <T>getEnumValues(Enum<? extends T>[])
     [javac]     E extends Enum<E> declared in class Enum
     [javac] 
 /home/tbroberg/rpmbuild/BUILD/hadoop-0.20.2-cdh3u1/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/Gridmix.java:399:
  error: type argument ? extends T is not within bounds of type-variable E
     [javac]     for (Enum<? extends T> v : e) {
     [javac]          ^
     [javac]   where T,E are type-variables:
     [javac]     T extends Object declared in method <T>getEnumValues(Enum<? extends T>[])
     [javac]     E extends Enum<E> declared in class Enum
     [javac] Note: Some input files use unchecked or unsafe operations.
     [javac] Note: Recompile with -Xlint:unchecked for details.
     [javac] 2 errors

 Questions:
 1 - Is there a more appropriate environment to work from than the Cloudera 
 distribution for developing codecs?
 2 - If not, is this an appropriate place to ask about how to build Cloudera?
 3 - Any suggestions for getting this rpm to rebuild?
 4 - Any suggestions for editing the rpm so I can just wipe out gridmix 
 altogether?

 TIA
     - Tim.
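
For what it's worth, a hedged sketch of a signature javac does accept here
(an unbounded wildcard always satisfies Enum's recursive bound); this is
illustrative only, not necessarily the fix the project shipped:

// Illustrative rewrite of the rejected method: Enum<?> avoids the
// "? extends T" type argument that falls outside E extends Enum<E>.
private static String getEnumValues(Enum<?>[] e) {
  StringBuilder sb = new StringBuilder();
  String sep = "";
  for (Enum<?> v : e) {
    sb.append(sep).append(v.name());
    sep = "|";
  }
  return sb.toString();
}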


Re: Hadoop SUN JAVA dependency

2011-10-20 Thread Eli Collins
Hey Amir,

The jiras you're looking for are:

HADOOP-6941. Support non-SUN JREs in UserGroupInformation
HADOOP-7211. Security uses proprietary Sun APIs

Also see http://wiki.apache.org/hadoop/HadoopJavaVersions

This needs to go in trunk as well as 20x.

Thanks,
Eli

On Thu, Oct 20, 2011 at 8:09 AM, Amir Sanjar v1san...@us.ibm.com wrote:

 Change Proposal,
        Removing SUN JAVA 6.0 dependencies from hadoop security sub-project.

 Targeted release
        hadoop 0.20.206.0+

 Targeted platforms:
        x86, PowerPC
        RHEL 6.x and SLES 11.x

 Next step
        Open Jira bug/feature report, if none has been opened

 Any comments?


 Best Regards
 Amir Sanjar

 Linux System Management Architect and Lead
 IBM Senior Software Engineer
 Phone# 512-286-8393
 Fax#      512-838-8858



[jira] [Resolved] (HADOOP-7634) Cluster setup docs specify wrong owner for task-controller.cfg

2011-09-19 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-7634.
-

   Resolution: Fixed
Fix Version/s: (was: 0.20.205.0)
   0.20.206.0
 Hadoop Flags: [Reviewed]

Thanks atm. I've committed this.

 Cluster setup docs specify wrong owner for task-controller.cfg 
 ---

 Key: HADOOP-7634
 URL: https://issues.apache.org/jira/browse/HADOOP-7634
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation, security
Affects Versions: 0.20.204.0
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
 Fix For: 0.20.206.0

 Attachments: hadoop-7634.patch


 The cluster setup docs indicate task-controller.cfg must be owned by the user 
 running TaskTracker but the code checks for root. We should update the docs 
 to reflect the real requirement.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7647) Update bylaws to reflect PMC chair rotation

2011-09-16 Thread Eli Collins (JIRA)
Update bylaws to reflect PMC chair rotation
---

 Key: HADOOP-7647
 URL: https://issues.apache.org/jira/browse/HADOOP-7647
 Project: Hadoop Common
  Issue Type: Task
  Components: documentation
Reporter: Eli Collins
Assignee: Eli Collins
 Attachments: hadoop-7647.patch

The PMC voted to rotate the chair annually. Let's update the bylaws accordingly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7650) Automated test for the tarballs

2011-09-16 Thread Eli Collins (JIRA)
Automated test for the tarballs
---

 Key: HADOOP-7650
 URL: https://issues.apache.org/jira/browse/HADOOP-7650
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Reporter: Eli Collins


Currently we don't test the generated tarball so we don't automatically detect 
changes that break the artifact generation or bin scripts (eg HADOOP-7356). We 
should have a simple functional test run from Jenkins that covers this stuff.

I wrote some simple automation for 20x tarballs that starts a pseudo cluster 
with multiple DNs and TTs and runs basic HDFS commands and MR jobs (with and 
w/o the LinuxTaskController) that could be used as a starting point. Code lives 
here: https://github.com/elicollins/hadoop-dev/tree/pseudo

Eric also added a lot of cluster config generation in HADOOP-7599 that could be 
used as a starting point as well.

This should also support trunk (Yarn) of course.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-7640) PluginDispatcher should identify which class could not be found

2011-09-15 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-7640.
-

Resolution: Invalid

Depends on HADOOP-5640 which isn't in trunk. 

 PluginDispatcher should identify which class could not be found
 ---

 Key: HADOOP-7640
 URL: https://issues.apache.org/jira/browse/HADOOP-7640
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Patrick Angeles
Priority: Minor

 Right now, you get a rather generic:
 "Unable to load dfs.namenode.plugins plugins"
 This is usually due to class-not-found issues. It would be helpful to 
 identify the specific class that could not be found.
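
A hypothetical sketch of the requested improvement (identifiers like
className and LOG are placeholders, not the actual PluginDispatcher code):

// Hypothetical: report which class failed to load rather than the
// generic "Unable to load dfs.namenode.plugins plugins" message.
try {
  Class<?> pluginClass = conf.getClassByName(className);
  // ... instantiate and dispatch to the plugin ...
} catch (ClassNotFoundException e) {
  LOG.error("Unable to load dfs.namenode.plugins class " + className, e);
}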

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: jenkins testing on 0.20.20x

2011-09-15 Thread Eli Collins
See HADOOP-7435.

On Thu, Sep 15, 2011 at 9:44 AM, Steve Loughran ste...@apache.org wrote:
 If I have some patches for the 0.20.20x branch, how do I submit them so they
 get applied and tested on that branch, rather than trunk? I have a patch
 that I want to get in there, but the JIRA process doesn't like the
 (unappliable to trunk) patch. I could get the patch into trunk first and
 backport it, but really you want both branches' patches auto-reviewed



Re: [PROPOSAL] Two Jira infrastructure additions to support sustaining bug fixes

2011-09-15 Thread Eli Collins
Hey Matt,

Thanks for the proposal, agree we should sort these out.

Wrt #1 IIUC the new workflow would be to use Target Version like we
use Fix Version today, but only set the Fix Version when we actually
commit to the given branch for the release. Seems reasonable.
Definitely better than creating a separate jira per branch.

Wrt #2 I think we can handle this by people following the patch naming
guidelines (in http://wiki.apache.org/hadoop/HowToContribute) and
closing out HADOOP-7435.

Thanks,
Eli

On Thu, Sep 15, 2011 at 11:58 AM, Matt Foley mfo...@hortonworks.com wrote:
 Hi all,
 for better or worse, the Hadoop community works in multiple branches.  We
 have to do sustaining work on 0.20, even while we hope that 0.23 will
 finally replace it.  Even after that happens, we will then need to do
 sustaining releases on 0.23 while future development goes into 0.24 or 0.25,
 and so on.

 This is the price we pay for having this stuff actually in use in
 production.  That's a good thing!
 And it's been that way in every software company I've worked in.

 My current efforts as release manager for 0.20.205 have made a couple
 deficiencies in our Jira infrastructure painfully obvious.  So I would like
 to propose two changes that will make it way easier and more reliable to
 handle patches for sustaining bug fixes.  But I wanted to bounce them off
 you and make sure they have community support before asking the
 Infrastructure team to look at them.


 1. Add a custom field "Target Version/s" [list].

 Motivation: When making a release, one wants to query all Jiras marked fixed
 in this release.  This can't be done reliably with current usage.
 Also, one wants to be able to query all open issues targeted for a given
 branch.  This can't be done reliably either.

 Why current usage is deficient:  Currently we have "Affects Version/s" and
 "Fix Version/s".  But the Fix Versions field is being overloaded.  It is used to
 mean "should be fixed in" (target versions) while the bug is open, and "is
 fixed in" (fix versions) after the bug is resolved.  That's fine if there's
 only one branch in use.  But if a fix is targeted for both A and B, and it's
 actually committed to A but not yet to B, there's no way to query the state
 of the fix.  The bug appears open for both (or sometimes it's incorrectly
 closed for both!).  You have to manually visit the individual bug report and
 review the SubversionCommits.  This might be automatable, but it sure isn't
 easily expressed.

 If we add a Target Versions field, then intent and completion can be
 separately marked (in the Target Versions and Fix Versions, respectively),
 and simple queries can clearly differentiate the cases.


 2. Add a "target branch/s" field to Attachments metadata (or if that's not
 feasible, establish a naming convention for Attachments to include this info)

 Motivation: Enable CI test-patch on patches targeted for non-trunk, as well
 as make the target clear to human viewers.

 If this field can be added (I'm not sure Jira supports it), I suggest adding
 it to the Attach Files dialogue box, and displaying it in the Attachments
 and Manage Attachments views. If the Infra team says Jira can't support it,
 then we (Hadoop dev) should talk about an unambiguous naming convention.

 If this meta-datum were available, it should be fairly easy to modify the
 automated test-patch process to test each patch against its intended target
 branch. (This process is managed internally by members of the Hadoop dev
 team, and I can help with it.)  This would give the usual benefits of CI to
 our sustaining processes as well as mainstream development.


 If you like either or both of these ideas, kindly +1 them.  If it's a bad
 idea, by all means say why.
 Absent negative feedback, I'm planning to open Infrastructure requests in a
 few days.



Re: [PROPOSAL] Two Jira infrastructure additions to support sustaining bug fixes

2011-09-15 Thread Eli Collins
On Thu, Sep 15, 2011 at 1:44 PM, Matt Foley mfo...@hortonworks.com wrote:
 On Thu, Sep 15, 2011 at 1:20 PM, Eli Collins e...@cloudera.com wrote:

 Hey Matt,

 Thanks for the proposal, agree we should sort these out.

 Wrt #1 IIUC the new workflow would be to use Target Version like we
 use Fix Version today, but only set the Fix Version when we actually
 commit to the given branch for the release.


 Exactly.


 Seems reasonable.
 Definitely better than creating a separate jira per branch.

 Wrt #2 I think we can handle this by people following the patch naming
 guidelines (in http://wiki.apache.org/hadoop/HowToContribute) and
 closing out HADOOP-7435.


 I'm okay with that.  And that change to Jira would probably be hard to get
 accepted by Infra anyway.

 I've transcribed the patch naming convention into HADOOP-7435, and assigned
 it to myself.



Awesome.  +1


 Thanks,
 --Matt

 Thanks,
 Eli

 On Thu, Sep 15, 2011 at 11:58 AM, Matt Foley mfo...@hortonworks.com
 wrote:
  Hi all,
  for better or worse, the Hadoop community works in multiple branches.  We
  have to do sustaining work on 0.20, even while we hope that 0.23 will
  finally replace it.  Even after that happens, we will then need to do
  sustaining releases on 0.23 while future development goes into 0.24 or 0.25,
  and so on.
 
  This is the price we pay for having this stuff actually in use in
  production.  That's a good thing!
  And it's been that way in every software company I've worked in.
 
  My current efforts as release manager for 0.20.205 have made a couple
  deficiencies in our Jira infrastructure painfully obvious.  So I would like
  to propose two changes that will make it way easier and more reliable to
  handle patches for sustaining bug fixes.  But I wanted to bounce them off
  you and make sure they have community support before asking the
  Infrastructure team to look at them.
 
  1. Add a custom field "Target Version/s" [list].
 
  Motivation: When making a release, one wants to query all Jiras marked fixed
  in this release.  This can't be done reliably with current usage.
  Also, one wants to be able to query all open issues targeted for a given
  branch.  This can't be done reliably either.
 
  Why current usage is deficient:  Currently we have "Affects Version/s" and
  "Fix Version/s".  But the Fix Versions field is being overloaded.  It is used
  to mean "should be fixed in" (target versions) while the bug is open, and "is
  fixed in" (fix versions) after the bug is resolved.  That's fine if there's
  only one branch in use.  But if a fix is targeted for both A and B, and it's
  actually committed to A but not yet to B, there's no way to query the state
  of the fix.  The bug appears open for both (or sometimes it's incorrectly
  closed for both!).  You have to manually visit the individual bug report and
  review the SubversionCommits.  This might be automatable, but it sure isn't
  easily expressed.
 
  If we add a Target Versions field, then intent and completion can be
  separately marked (in the Target Versions and Fix Versions, respectively),
  and simple queries can clearly differentiate the cases.
 
  2. Add a "target branch/s" field to Attachments metadata (or if that's not
  feasible, establish a naming convention for Attachments to include this info)
 
  Motivation: Enable CI test-patch on patches targeted for non-trunk, as well
  as make the target clear to human viewers.
 
  If this field can be added (I'm not sure Jira supports it), I suggest adding
  it to the Attach Files dialogue box, and displaying it in the Attachments
  and Manage Attachments views. If the Infra team says Jira can't support it,
  then we (Hadoop dev) should talk about an unambiguous naming convention.
 
  If this meta-datum were available, it should be fairly easy to modify the
  automated test-patch process to test each patch against its intended target
  branch. (This process is managed internally by members of the Hadoop dev
  team, and I can help with it.)  This would give the usual benefits of CI to
  our sustaining processes as well as mainstream development.
 
  If you like either or both of these ideas, kindly +1 them.  If it's a bad
  idea, by all means say why.
  Absent negative feedback, I'm planning to open Infrastructure requests in a
  few days.
 




[jira] [Created] (HADOOP-7634) Cluster setup docs specify wrong owner for task-controller.cfg

2011-09-13 Thread Eli Collins (JIRA)
Cluster setup docs specify wrong owner for task-controller.cfg 
---

 Key: HADOOP-7634
 URL: https://issues.apache.org/jira/browse/HADOOP-7634
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation, security
Affects Versions: 0.20.204.0
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
 Fix For: 0.20.205.0
 Attachments: hadoop-7634.patch

The cluster setup docs indicate task-controller.cfg must be owned by the user 
running TaskTracker but the code checks for root. We should update the docs to 
reflect the real requirement.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: JIRA attachments order

2011-09-12 Thread Eli Collins
On Mon, Sep 12, 2011 at 9:33 AM, Eric Payne er...@yahoo-inc.com wrote:
 So, if a patch is the same for trunk plus one or more branches, would we need 
 to upload multiple patches?


Correct.  In my experience it's been very rare that one patch applies
to trunk and a stable branch, and trunk to a feature branch is
normally done by a merge.

Thanks,
Eli


Re: JIRA attachments order

2011-09-09 Thread Eli Collins
How about we update the HowToContribute page with a simple standard?

jira-xyz.patch  # for trunk
jira-xyz-branch.patch  # for a release branch; branch could use a shortened
name, eg "20x" for branch-20-security and "append" for
branch-20-append.

Thanks,
Eli

On Fri, Sep 9, 2011 at 10:32 AM, Robert Evans ev...@yahoo-inc.com wrote:
 Can I ask, though that we do add branch information in the patches.  Too 
 often a patch is intended to apply to some branch other then trunk, and there 
 is no easy way to tell what branch it was intended for.

 --Bobby Evans


 On 9/9/11 10:52 AM, Mattmann, Chris A (388J) 
 chris.a.mattm...@jpl.nasa.gov wrote:

 Wow, I didn't know that!

 Learn something new everyday, thanks guys.

 Cheers,
 Chris

 On Sep 9, 2011, at 9:48 AM, Doug Cutting wrote:

 On 09/09/2011 07:27 AM, Ted Dunning wrote:
 If you post the same patch with the same name, JIRA helps you out by greying
 all the earlier versions out.

 Indeed.  That's the best practice, not to add version numbers to patch
 files, for this very reason.  We should perhaps note this on:

 http://wiki.apache.org/hadoop/HowToContribute

 I am a Jira administrator and would be happy to change the default
 ordering of attachments if it were possible, however I can see no option
 to do so.

 Doug


 ++
 Chris Mattmann, Ph.D.
 Senior Computer Scientist
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 171-266B, Mailstop: 171-246
 Email: chris.a.mattm...@nasa.gov
 WWW:   http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Assistant Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++





Re: JIRA attachments order

2011-09-09 Thread Eli Collins
Personally I like version numbers as well, they allow me to refer to a
specific version of the patch (vs a patch at a given date/time).

I also notice some people use .txt which browsers can view in place vs
.patch which will download by default unless you register a viewer.

Thanks,
Eli

On Fri, Sep 9, 2011 at 11:08 AM, Ravi Prakash ravihad...@gmail.com wrote:
 But what if I want to see an incremental diff between two patches? I don't
 want to review the whole patch every time. Maybe I just want to re-review the
 changes made to a patch. I would then have to sort the patches manually by
 time. I think it's better to have version numbers in that case


 On Fri, Sep 9, 2011 at 10:48 AM, Doug Cutting cutt...@apache.org wrote:

 On 09/09/2011 07:27 AM, Ted Dunning wrote:
  If you post the same patch with the same name, JIRA helps you out by
 greying
  all the earlier versions out.

 Indeed.  That's the best practice, not to add version numbers to patch
 files, for this very reason.  We should perhaps note this on:

 http://wiki.apache.org/hadoop/HowToContribute

 I am a Jira administrator and would be happy to change the default
 ordering of attachments if it were possible, however I can see no option
 to do so.

 Doug




Re: JIRA attachments order

2011-09-09 Thread Eli Collins
On Fri, Sep 9, 2011 at 12:52 PM, Doug Cutting cutt...@apache.org wrote:
 On 09/09/2011 11:12 AM, Eli Collins wrote:
 Personally I like version numbers as well, they allow me to refer to a
 specific version of the patch (vs a patch at a given date/time).

 Re-using the name doesn't hide the old versions, it just makes them
 gray.  They're still listed with date and may be sorted by date.  If
 you select the Activity > All tab then the different versions are linked
 to in the comment stream, providing context.

 90+% of the time I'm interested in the most recent version of the patch,
 so the value of having it highlighted is great.  Frequently when
 different names are used I mistakenly download the wrong version and
 waste time reviewing it, but when the same name is used I always get the
 most recent.  The highlighting is very effective for me.


I'm cool w/ adopting the names w/o versions. Standardizing on one
form would be easier for everyone, especially new contributors.

Anyone object to me updating HowToContribute with the following?

Patches for trunk should be named:   jira-xyz.patch
eg hdfs-123.patch

Patches for a specific branch should be named:  jira-xyz-branch.patch
where branch may be abbreviated, eg hdfs-123-security.patch

I'll indicate the rationale wrt jira so people know why it's this way.

Thanks,
Eli


Re: JIRA attachments order

2011-09-09 Thread Eli Collins
On Fri, Sep 9, 2011 at 1:15 PM, Aaron T. Myers a...@cloudera.com wrote:
 On Fri, Sep 9, 2011 at 12:57 PM, Eli Collins e...@cloudera.com wrote:

 Patches for a specific branch should be named:  jira-xyz-branch.patch
 where branch may be abbreviated, eg hdfs-123-security.patch


 +1, if we ever hope to implement HADOOP-7435 [1], it will be necessary to
 standardize the branch-name-in-patch-name scheme.


Good point.  One way to enforce this is for Jenkins to only run tests
against patches that follow this naming scheme, and to put the actual
branch name in the patch name (where no branch means trunk). Ie
xyz-123.patch will run against trunk, xyz-123-branchx.patch will run
against branch x, and all other patches will be ignored by Jenkins.
Somewhat tedious, eg for branch-20-security, but it means we don't have
to maintain a mapping.

Thanks,
Eli



 --
 Aaron T. Myers
 Software Engineer, Cloudera

 [1] https://issues.apache.org/jira/browse/HADOOP-7435
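
A hypothetical sketch of how Jenkins could derive the target from the patch
name alone (no suffix means trunk), which is what makes the scheme
mapping-free; the regex and names are illustrative assumptions:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical: parse "<jira>-<num>[-<branch>].patch"; the absence of
// a branch suffix selects trunk, so no per-branch mapping is kept.
Pattern p = Pattern.compile("[a-zA-Z]+-\\d+(?:-(.+))?\\.patch");
Matcher m = p.matcher("hdfs-123-branch-20-security.patch");
String target = "trunk";
if (m.matches() && m.group(1) != null) {
  target = m.group(1);  // "branch-20-security"
}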



Re: JIRA attachments order

2011-09-09 Thread Eli Collins
I updated HowToContribute, if people like this prose I'll advertise
the change to *-dev.

Thanks,
Eli

On Fri, Sep 9, 2011 at 2:53 PM, Eli Collins e...@cloudera.com wrote:
 On Fri, Sep 9, 2011 at 1:15 PM, Aaron T. Myers a...@cloudera.com wrote:
 On Fri, Sep 9, 2011 at 12:57 PM, Eli Collins e...@cloudera.com wrote:

 Patches for a specific branch should be named:  jira-xyz-branch.patch
 where branch may be abbreviated, eg hdfs-123-security.patch


 +1, if we ever hope to implement HADOOP-7435 [1], it will be necessary to
 standardize the branch-name-in-patch-name scheme.


  Good point.  One way to enforce this is for Jenkins to only run tests
  against patches that follow this naming scheme, and to put the actual
  branch name in the patch name (where no branch means trunk). Ie
  xyz-123.patch will run against trunk, xyz-123-branchx.patch will run
  against branch x, and all other patches will be ignored by Jenkins.
  Somewhat tedious, eg for branch-20-security, but it means we don't have to
  maintain a mapping.

 Thanks,
 Eli



 --
 Aaron T. Myers
 Software Engineer, Cloudera

 [1] https://issues.apache.org/jira/browse/HADOOP-7435




Re: Hadoop Tools Layout (was Re: DistCpV2 in 0.23)

2011-09-06 Thread Eli Collins
On Tue, Sep 6, 2011 at 10:11 AM, Allen Wittenauer a...@apache.org wrote:

 On Sep 6, 2011, at 9:30 AM, Vinod Kumar Vavilapalli wrote:
 We still need to answer Amareshwari's question (2) she asked some time back
 about the automated code compilation and test execution of the tools module.



  My #1 question is if tools is basically contrib reborn.  If not, what makes
  it different?


        I'm still waiting for this answer as well.

        Until such, I would be pretty much against a tools module.  Changing 
 the name of the dumping ground doesn't make it any less of a dumping ground.

IMO if the tools module only gets stuff like distcp that's maintained
then it's not contrib; if it contains all the stuff from the current
MR contrib then tools is just a re-labeling of contrib. Given that
this proposal only covers moving distcp to tools, it doesn't sound like
contrib to me.

Thanks,
Eli


Re: System tests on 0.20.20x releases

2011-08-22 Thread Eli Collins
On Sun, Aug 21, 2011 at 9:34 PM, Konstantin Boudnik c...@apache.org wrote:
 System (Herriot controlled) tests were a part of nightly testing of every
 build for at least 2 of the .2xx releases. I really can not comment on .203 and
 after.

Owen - are you running the system tests on the 20x release candidates?
Do we know if the 20x release pass the system tests?

 A normal procedure was to build normal bits and run the tests; build
 instrumented bits, deploy them to a 10 nodes cluster, and run system tests.
 The current state of the code is that system tests require source code
 workspace to be executed from. I have done some initial work to do workspace
 independent testing but I don't know if it has been included to the public
 releases of .203+ - I haven't really checked.

 At any rate, running system tests are an easy task and the wiki page is
 explaining how to do it.

Running the system tests is actually not easy: those wiki instructions
are out of date, require all kinds of manual steps, and some of the
tests fail when just run from a local build (ie they require 3 DNs so
you have to set up a cluster).

 Assembling an instrumented cluster on the other hand
 requires certain knowledge and release process and bits production.
 Instrumented cluster isn't fault-injected - it is just instrumented ;) Yes, it
 contains a few extra helper API calls in a few classes, which is exactly what
 makes them way more useful for testing purposes. Without those a number of
 testing scenarios would be impossible to implement, as I have explained on
 many occasions.

Could you point me to a thread that covers the few extra helper API
calls that are injected?  I can't see what API would both be necessary
for a system test and also not able to be included in the product itself.
 If you're system testing an instrumented build then you're not system
testing the product used by users.


 For regular runs of the system tests Roman and I created a regular
 deployment of 0.22 cluster builds under Apache Hudson control a few months
 ago. I don't know what's going on with this testing after the recent troubles
 with the build machines.

How hard would it be to copy your 22 system test Jenkins job and adapt
it to use a 20x build?  Seems like the test bits should mostly be the
same.

Thanks,
Eli


[jira] [Created] (HADOOP-7569) Remove common start-all.sh

2011-08-22 Thread Eli Collins (JIRA)
Remove common start-all.sh
--

 Key: HADOOP-7569
 URL: https://issues.apache.org/jira/browse/HADOOP-7569
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Reporter: Eli Collins
Priority: Minor
 Fix For: 0.23.0


MAPREDUCE-2736 removes start-mapred.sh. We should either update the call to 
start-mapred to use hadoop-yarn/bin/start-all.sh instead or just remove the 
script since it's deprecated.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7570) hadoop-config.sh needs to be updated post MR2

2011-08-22 Thread Eli Collins (JIRA)
hadoop-config.sh needs to be updated post MR2
-

 Key: HADOOP-7570
 URL: https://issues.apache.org/jira/browse/HADOOP-7570
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Reporter: Eli Collins
Priority: Blocker
 Fix For: 0.23.0


hadoop-common/src/main/bin/hadoop-config.sh needs to be updated post MR2 (eg 
the layout of mapred home has changed).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7571) hadoop-config.sh needs to be updated post mavenization

2011-08-22 Thread Eli Collins (JIRA)
hadoop-config.sh needs to be updated post mavenization
--

 Key: HADOOP-7571
 URL: https://issues.apache.org/jira/browse/HADOOP-7571
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Reporter: Eli Collins
Priority: Blocker
 Fix For: 0.23.0


hadoop-common/src/main/bin/hadoop-config.sh needs to be updated post 
mavenization (eg it still refers to build/classes etc).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7572) Ability to run the daemons from source trees

2011-08-22 Thread Eli Collins (JIRA)
Ability to run the daemons from source trees


 Key: HADOOP-7572
 URL: https://issues.apache.org/jira/browse/HADOOP-7572
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Reporter: Eli Collins


It's very useful for developers to be able to run the daemons w/o 
building/deploying tarballs. This feature used to work but is now very out of 
date after the RPM and maven related changes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-7321) The link to the documentation on the hadoop common page is broken

2011-08-19 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-7321.
-

Resolution: Fixed

 The link to the documentation on the hadoop common page is broken
 -

 Key: HADOOP-7321
 URL: https://issues.apache.org/jira/browse/HADOOP-7321
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Reporter: Roko Kruze

 Currently the following link is broken for the documentation: 
 http://hadoop.apache.org/common/docs/stable/

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




System tests on 0.20.20x releases

2011-08-19 Thread Eli Collins
Has anyone tried running the system tests on a 0.20.20x release?  Why
don't we run these via Hudson?

After following the instructions on the wiki [1] and making a bunch of
additional fixes (setting dfs.datanode.ipc.address in the config,
using sbin instead of bin, copying libs into the FI build lib dir,
etc) I was able to get the tests running, however the tests seem to
have bitrotted.

The reason I ask is that it looks like the src/test/system tests are
only compiled or run via the test-system target and it doesn't look
like Hudson or developers use that target, therefore we're not doing
anything to prevent people from breaking the tests. I tried to run
them to see if one of my changes would break them but I can't imagine
most people will jump through all the above hoops.

On a related note, is there any way to run these against an existing
build/cluster? It looks like they require running on a build that's
been fault injected (ie they use custom protocol classes that are not
present in the normal tarball) which makes them much less useful.

Thanks,
Eli

1. http://wiki.apache.org/hadoop/HowToUseSystemTestFramework


[jira] [Created] (HADOOP-7548) ant binary target fails if native has not been built

2011-08-17 Thread Eli Collins (JIRA)
ant binary target fails if native has not been built


 Key: HADOOP-7548
 URL: https://issues.apache.org/jira/browse/HADOOP-7548
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.20.205.0
Reporter: Eli Collins
 Fix For: 0.20.205.0


The binary target on branch-0.20-security fails with the following; it 
assumes the native dir exists.

BUILD FAILED
/home/eli/src/hadoop-branch-0.20-security/build.xml:1572: 
/home/eli/src/hadoop-branch-0.20-security/build/hadoop-0.20.206.0-SNAPSHOT/native
 not found.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7551) LocalDirAllocator should incorporate LocalStorage

2011-08-17 Thread Eli Collins (JIRA)
LocalDirAllocator should incorporate LocalStorage
-

 Key: HADOOP-7551
 URL: https://issues.apache.org/jira/browse/HADOOP-7551
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 0.20.204.0
Reporter: Eli Collins


The o.a.h.fs.LocalDirAllocator is not aware of o.a.h.m.t.LocalStorage 
(introduced in MAPREDUCE-2413) - it always considers the configured local dirs, 
not just the ones that happen to be good. Therefore if there's a disk failure 
then *every* call to get a local path will result in 
LocalDirAllocator#confChanged doing a disk check of *all* the configured local 
dirs. It seems like LocalStorage should be a private class to LocalDirAllocator 
so that all users of LocalDirAllocator benefit from the disk failure handling 
and all the various users of LocalDirAllocator don't have to be modified to 
handle disk failures. Note that LocalDirAllocator already handles faulty 
directories.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7552) FileUtil#fullyDelete doesn't throw IOE but lists it in the throws clause

2011-08-17 Thread Eli Collins (JIRA)
FileUtil#fullyDelete doesn't throw IOE but lists it in the throws clause


 Key: HADOOP-7552
 URL: https://issues.apache.org/jira/browse/HADOOP-7552
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
 Fix For: 0.23.0


FileUtil#fullyDelete doesn't throw IOException so it shouldn't have IOException 
in its throws clause. Having it listed makes it easy to think you'll get an 
IOException, eg when trying to delete a non-existent file or on an IO error 
accessing the local file, but you don't.
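
A small usage sketch of the point above (LOG is an assumed logger): failure,
including a nonexistent path, shows up only in the boolean return value.

import java.io.File;
import org.apache.hadoop.fs.FileUtil;

// fullyDelete never actually throws IOException; check the return
// value to detect a nonexistent file or an underlying I/O failure.
File dir = new File("/tmp/scratch");
if (!FileUtil.fullyDelete(dir)) {
  LOG.warn("Could not fully delete " + dir);  // LOG: assumed logger
}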

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6423) Add unit tests framework (Mockito)

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6423.
-

Resolution: Fixed

mockito was added

 Add unit tests framework (Mockito)
 --

 Key: HADOOP-6423
 URL: https://issues.apache.org/jira/browse/HADOOP-6423
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.21.0, 0.22.0
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik

 Common tests are functional or end-to-end tests. It makes sense to have the 
 Mockito framework available for the convenience of true unit test development. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6503) contrib projects should pull in the ivy-fetched libs from the root project

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6503.
-

Resolution: Won't Fix

NA post mavenization

 contrib projects should pull in the ivy-fetched libs from the root project
 --

 Key: HADOOP-6503
 URL: https://issues.apache.org/jira/browse/HADOOP-6503
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hadoop-6503-branch-0.20.txt


 On branch-20 currently, I get an error just running ant contrib 
 -Dtestcase=TestHdfsProxy. In a full ant test build sometimes this doesn't 
 appear to be an issue. The problem is that the contrib projects don't 
 automatically pull in the dependencies of the Hadoop ivy project. Thus, 
 they each have to declare all of the common dependencies like commons-cli, 
 etc. Some are missing and this causes test failures.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6606) Change the default HADOOP_PID_DIR to $HADOOP_HOME/pids

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6606.
-

Resolution: Won't Fix

 Change the default HADOOP_PID_DIR to $HADOOP_HOME/pids
 --

 Key: HADOOP-6606
 URL: https://issues.apache.org/jira/browse/HADOOP-6606
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.20.2
Reporter: Chad Metcalf
Assignee: Chad Metcalf
 Attachments: HADOOP-6606.patch


 /tmp should not be used as a pid directory. There is too high a likelihood 
 that pid files could be altered or deleted. A more reasonable default is 
 $HADOOP_HOME/pids. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6604) Improve HADOOP_HOME detection and handling in hadoop-config

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6604.
-

Resolution: Won't Fix

Out of date

 Improve HADOOP_HOME detection and handling in hadoop-config
 ---

 Key: HADOOP-6604
 URL: https://issues.apache.org/jira/browse/HADOOP-6604
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.20.2
Reporter: Chad Metcalf
Assignee: Chad Metcalf
Priority: Minor
 Attachments: HADOOP-6604.patch


 If HADOOP_HOME is not set by the time hadoop-config is sourced we should try 
 to guess it. Generally speaking hadoop-config.sh is in bin and HADOOP_HOME 
 should generally be .. from there. Additionally we can do some verification 
 of this guess by looking for jars in HADOOP_HOME.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6880) Documentation: Chinese (cn) doc is old

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6880.
-

Resolution: Fixed

cn docs were removed

 Documentation: Chinese (cn) doc is old
 --

 Key: HADOOP-6880
 URL: https://issues.apache.org/jira/browse/HADOOP-6880
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.20.2
Reporter: huang jian

 Documentation: Chinese (cn) doc is old.
 http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html
 Java(TM) 1.6.x, preferably from Sun, must be installed.
 Configuration
 Use the following:
 conf/core-site.xml:
 <configuration>
   <property>
     <name>fs.default.name</name>
     <value>hdfs://localhost:9000</value>
   </property>
 </configuration>
 conf/hdfs-site.xml:
 <configuration>
   <property>
     <name>dfs.replication</name>
     <value>1</value>
   </property>
 </configuration>
 conf/mapred-site.xml:
 <configuration>
   <property>
     <name>mapred.job.tracker</name>
     <value>localhost:9001</value>
   </property>
 </configuration>
 http://hadoop.apache.org/common/docs/r0.20.2/cn/quickstart.html
 Java(TM) 1.5.x must be installed; the Java release from Sun is recommended.
 Configuration
 Use the following conf/hadoop-site.xml:
 <configuration>
   <property>
     <name>fs.default.name</name>
     <value>localhost:9000</value>
   </property>
   <property>
     <name>mapred.job.tracker</name>
     <value>localhost:9001</value>
   </property>
   <property>
     <name>dfs.replication</name>
     <value>1</value>
   </property>
 </configuration>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6247) move the hdfs and mapred scripts to their respective subprojects

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6247.
-

Resolution: Fixed

This was fixed

 move the hdfs and mapred scripts to their respective subprojects
 

 Key: HADOOP-6247
 URL: https://issues.apache.org/jira/browse/HADOOP-6247
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Reporter: Owen O'Malley

 The scripts for mapred and hdfs are currently in common, but they need to be 
 moved to their respective subprojects.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6191) Allow Super user access only from certain trusted IP Range- This is to avoid spoofing by others as super user and gain access to the cluster

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6191.
-

Resolution: Won't Fix

 Allow Super user access only from certain trusted IP Range- This is to avoid 
 spoofing by others as super user and gain access to the cluster
 

 Key: HADOOP-6191
 URL: https://issues.apache.org/jira/browse/HADOOP-6191
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 0.18.2
Reporter: Pallavi Palleti
Priority: Minor
 Attachments: Hadoop-6191.patch, commons-net-ftp-2.0.jar


 This is similar to https://issues.apache.org/jira/browse/HADOOP-6187 but for 
 the 0.18.2 version

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6159) Remove core from build.xml

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6159.
-

Resolution: Won't Fix

NA post mavenization

 Remove core from build.xml
 

 Key: HADOOP-6159
 URL: https://issues.apache.org/jira/browse/HADOOP-6159
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Tsz Wo (Nicholas), SZE

 build.xml is still creating core directories and other stuff such as
 {code}
   <property name="test.core.build.classes" 
     value="${test.build.dir}/core/classes"/>
 ...
   <property name="jdiff.stable.javadoc" 
     value="http://hadoop.apache.org/core/docs/r${jdiff.stable}/api/"/>
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6110) test-patch takes 45min!

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6110.
-

Resolution: Fixed

Runs quickly now

 test-patch takes 45min!
 ---

 Key: HADOOP-6110
 URL: https://issues.apache.org/jira/browse/HADOOP-6110
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build, test
Reporter: Amar Kamat
Assignee: Giridharan Kesavan
Priority: Critical

 The runtime of test-patch has increased to 45min!

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6092) No space left on device

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6092.
-

Resolution: Duplicate

Already have jiras tracking this, see last comment.

 No space left on device
 ---

 Key: HADOOP-6092
 URL: https://issues.apache.org/jira/browse/HADOOP-6092
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.19.0
 Environment: ubuntu0.8.4
Reporter: mawanqiang

 Exception in thread "main" org.apache.hadoop.fs.FSError: java.io.IOException: 
 No space left on device
 at 
 org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:199)
 at 
 java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
 at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
 at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
 at 
 org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
 at 
 org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
 at 
 org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:339)
 at 
 org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
 at 
 org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:825)
 at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
 at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
 at org.apache.nutch.indexer.Indexer.run(Indexer.java:92)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at org.apache.nutch.indexer.Indexer.main(Indexer.java:101)
 Caused by: java.io.IOException: No space left on device
 at java.io.FileOutputStream.writeBytes(Native Method)
 at java.io.FileOutputStream.write(FileOutputStream.java:260)
 at 
 org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:197)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-6035) jobtracker stops when namenode goes out of safemode running capacity scheduler

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-6035.
-

Resolution: Invalid

Should be moved to the list.

 jobtracker stops when namenode goes out of safemode running capacity scheduler
 

 Key: HADOOP-6035
 URL: https://issues.apache.org/jira/browse/HADOOP-6035
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.20.0
 Environment: Fedora 10
Reporter: Anjali M
Priority: Minor
 Attachments: capacity-scheduler.xml, 
 hadoop-hadoop-jobtracker-anjus.in.log, 
 hadoop-hadoop-tasktracker-anjus.in.log.2009-06-24, hadoop-site.xml


 I am facing a problem running the capacity scheduler in hadoop-0.20.0.
 The jobtracker is listing the queues when namenode is in the safemode.
 Once the namenode goes out of the safemode the jt stops working. On
 accessing jobqueue details it shows the following error.
 HTTP ERROR: 500
 INTERNAL_SERVER_ERROR
 RequestURI=/jobqueue_details.jsp
 Caused by:
 java.lang.NullPointerException
at 
 org.apache.hadoop.mapred.JobQueuesManager.getRunningJobQueue(JobQueuesManager.java:156)
at 
 org.apache.hadoop.mapred.CapacityTaskScheduler.getJobs(CapacityTaskScheduler.java:1495)
at 
 org.apache.hadoop.mapred.jobqueue_005fdetails_jsp._jspService(jobqueue_005fdetails_jsp.java:64)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at 
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
at 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:324)
at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
at 
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
 Is it because any of the configuration in capacity-scheduler.xml is incorrect?
 I tried forcing the namenode out of the safemode with bin/hadoop
 dfsadmin, but still it does not work.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-5405) Adding support for HDFS proxy

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-5405.
-

Resolution: Won't Fix

hdfs proxy was removed.

 Adding support for HDFS proxy
 -

 Key: HADOOP-5405
 URL: https://issues.apache.org/jira/browse/HADOOP-5405
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Kan Zhang
Assignee: Kan Zhang

 Currently, HDFS doesn't authenticate users. The client simply tells HDFS who 
 she is and HDFS will take her word for it. That makes it possible for HDFS 
 proxy to access HDFS on behalf of a user - the proxy simply claims to be that 
 user. Once we turn on authentication, the proxy can't do that without having 
 user's credentials. We need a solution such that HDFS proxy can continue to 
 access HDFS on behalf of users.
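 For reference, the proxy-user mechanism that later grew out of this line of 
 work looks roughly like the sketch below; treat the UserGroupInformation API 
 shown as illustrative for this discussion rather than the code of the era:
 {noformat}
 import java.security.PrivilegedExceptionAction;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.security.UserGroupInformation;

 public class ProxyAccess {
   // The proxy authenticates as itself; the server checks its configured
   // proxyuser rules before allowing it to act on behalf of 'user'.
   public static FileSystem fsAs(String user, final Configuration conf)
       throws Exception {
     UserGroupInformation proxy = UserGroupInformation.createProxyUser(
         user, UserGroupInformation.getLoginUser());
     return proxy.doAs(new PrivilegedExceptionAction<FileSystem>() {
       public FileSystem run() throws Exception {
         return FileSystem.get(conf);
       }
     });
   }
 }
 {noformat}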

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-5615) Spec file and SRPM for building a Hadoop-0.19.1 RPM

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-5615.
-

Resolution: Won't Fix

 Spec file and SRPM for building a Hadoop-0.19.1 RPM
 ---

 Key: HADOOP-5615
 URL: https://issues.apache.org/jira/browse/HADOOP-5615
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.19.1
Reporter: Ian Soboroff
Priority: Minor
 Attachments: hadoop-fuse.spec, hadoop.spec, hadoop.spec

   Original Estimate: 0h
  Remaining Estimate: 0h

 I like the idea of Cloudera's RPMs, in that packages are convenient and the 
 boot scripts are very handy, but they are for a patched 0.18.3 and include 
 other stuff.  Here I offer a spec file for 0.19.1 without extra cruft.  It is 
 essentially the spec from Cloudera's RPM with suitable edits.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-5263) Documentation: Chinese (cn) doc structure placed in the middle of the English doc structure

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-5263.
-

Resolution: Won't Fix

cn docs were removed

 Documentation: Chinese (cn) doc structure placed in the middle of the English 
 doc structure
 ---

 Key: HADOOP-5263
 URL: https://issues.apache.org/jira/browse/HADOOP-5263
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build, documentation
Affects Versions: 0.19.0
Reporter: Corinne Chandel
Priority: Minor

 The Chinese doc structure was plopped into the middle of the English doc 
 structure. You need to figure out where you are going to put translations of 
 the Hadoop core docs.
 [-] src
  [-] docs -- English docs
   [+] .svn
   [+] build
 changes
   [+] cn  - Chinese docs
   [+] src

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-5256) Some tests not run by default

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-5256.
-

Resolution: Not A Problem

 Some tests not run by default
 -

 Key: HADOOP-5256
 URL: https://issues.apache.org/jira/browse/HADOOP-5256
 Project: Hadoop Common
  Issue Type: Bug
  Components: build, test
Reporter: Nigel Daley
Assignee: gary murry

 To be run by the 'test' target, test file names must start with Test and 
 end in .java.  One example that violates this is 
 src/test/org/apache/hadoop/fs/FileSystemContractBaseTest.java.  Is this on 
 purpose so that it's not run automatically?  Are there other tests like this?
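 For illustration, the intent behind the naming convention in a simplified 
 sketch; the class bodies and the concrete subclass name are hypothetical:
 {noformat}
 // Shared contract tests live in an abstract base whose file name does not
 // start with Test, so the 'test' target never runs it directly...
 abstract class FileSystemContractBaseTest extends junit.framework.TestCase {
   public void testSomethingShared() { assertTrue(true); }
 }

 // ...while Test-prefixed concrete subclasses are picked up automatically.
 public class TestExampleContract extends FileSystemContractBaseTest {
 }
 {noformat}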

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-4819) 0.19.1 will not build under Solaris 5.10 (x86)

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-4819.
-

Resolution: Duplicate

Another jira tracks solaris compilation.

 0.19.1 will not build under Solaris 5.10 (x86)
 --

 Key: HADOOP-4819
 URL: https://issues.apache.org/jira/browse/HADOOP-4819
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.19.1
 Environment: SunOS dcache09 5.10 Generic_127128-11 i86pc i386 i86pc
 SunFire X4500 Thumper
Reporter: Carl Lundstedt

 I checked out branch-0.19 from svn and attempted a build.
  ant -Dcompile.native=true -Dnonspace.os=SunOS -Dmake.cmd=gmake clean tar
  [exec] gmake[2]: Entering directory 
 `/opt/hadoop/hadoop-0.19.1/branch-0.19/build/native/SunOS-x86-32/src/org/apache/hadoop/io/compress/lzo'
  [exec] if /bin/sh ../../../../../../../libtool --tag=CC --mode=compile 
 gcc -DHAVE_CONFIG_H -I. 
 -I/opt/hadoop/hadoop-0.19.1/branch-0.19/src/native/src/org/apache/hadoop/io/compress/lzo
  -I../../../../../../..  -I/opt/SDK/jdk//include 
 -I/opt/SDK/jdk//include/solaris 
 -I/opt/hadoop/hadoop-0.19.1/branch-0.19/src/native/src  -g -Wall -fPIC -O2 
 -m32 -g -O2 -MT LzoCompressor.lo -MD -MP -MF .deps/LzoCompressor.Tpo -c -o 
 LzoCompressor.lo 
 /opt/hadoop/hadoop-0.19.1/branch-0.19/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c;
  \
  [exec] then mv -f .deps/LzoCompressor.Tpo 
 .deps/LzoCompressor.Plo; else rm -f .deps/LzoCompressor.Tpo; exit 1; fi
  [exec] mkdir .libs
  [exec]  gcc -DHAVE_CONFIG_H -I. 
 -I/opt/hadoop/hadoop-0.19.1/branch-0.19/src/native/src/org/apache/hadoop/io/compress/lzo
  -I../../../../../../.. -I/opt/SDK/jdk//include 
 -I/opt/SDK/jdk//include/solaris -I/opt/hadoop/hadoop-0.19.1/branch-0.19/src/native/src -g 
 -Wall -fPIC -O2 -m32 -g -O2 -MT LzoCompressor.lo -MD -MP -MF 
 .deps/LzoCompressor.Tpo -c 
 /opt/hadoop/hadoop-0.19.1/branch-0.19/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c
   -fPIC -DPIC -o .libs/LzoCompressor.o
  [exec] 
 /opt/hadoop/hadoop-0.19.1/branch-0.19/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c:
  In function `Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs':
  [exec] 
 /opt/hadoop/hadoop-0.19.1/branch-0.19/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c:137:
  error: syntax error before ',' token
  [exec] gmake[2]: *** [LzoCompressor.lo] Error 1
  [exec] gmake[2]: Leaving directory 
 `/opt/hadoop/hadoop-0.19.1/branch-0.19/build/native/SunOS-x86-32/src/org/apache/hadoop/io/compress/lzo'
  [exec] gmake[1]: *** [all-recursive] Error 1
  [exec] gmake[1]: Leaving directory 
 `/opt/hadoop/hadoop-0.19.1/branch-0.19/build/native/SunOS-x86-32'
  [exec] gmake: *** [all] Error 2
 Not sure what the cause or fix may be.  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-5010) Document HTTP/HTTPS methods to read directory and file data

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-5010.
-

Resolution: Fixed

HDFS proxy was removed, hftp is documented.

 Document HTTP/HTTPS methods to read directory and file data
 ---

 Key: HADOOP-5010
 URL: https://issues.apache.org/jira/browse/HADOOP-5010
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Affects Versions: 0.18.0
Reporter: Marco Nicosia
Priority: Trivial
 Attachments: 5010-0.patch


 In HADOOP-1563, [~cutting] wrote:
 bq. The URI for this should be something like hftp://host:port/a/b/c, since, 
 while HTTP will be used as the transport, this will not be a FileSystem for 
 arbitrary HTTP urls.
 Recently, we've been talking about implementing an HDFS proxy (HADOOP-4575) 
 which would be a secure way to make HFTP/HSFTP available. In so doing, we may 
 even remove HFTP/HSFTP from being offered on the HDFS itself (that's another 
 discussion).
 In the case of the HDFS proxy, does it make sense to do away with the 
 artificial HFTP/HSFTP protocols, and instead simply offer standard HTTP and 
 HTTPS? That would allow non-HDFS-specific clients, as well as using various 
 standard HTTP infrastructure, such as load balancers, etc.
 NB, to the best of my knowledge, HFTP is only documented on the 
 [distcp|http://hadoop.apache.org/core/docs/current/distcp.html] page, and 
 HSFTP is not documented at all?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-4385) Build fails for Mac OSX

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-4385.
-

Resolution: Fixed

Fixed elsewhere.

 Build fails for Mac OSX
 ---

 Key: HADOOP-4385
 URL: https://issues.apache.org/jira/browse/HADOOP-4385
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.18.1
 Environment: Mac OSX 10+
Reporter: Patrick Winters
 Attachments: mac-fuse-hdfs.patch


 Automatic build fails on Mac OS X, due to linking errors with the JVM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-4484) 0.18.1 breaks SOCKS server setting

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-4484.
-

Resolution: Won't Fix

Out of date

 0.18.1 breaks SOCKS server setting
 --

 Key: HADOOP-4484
 URL: https://issues.apache.org/jira/browse/HADOOP-4484
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 0.18.1
Reporter: Eugene Hung
Priority: Minor

 Our cluster is behind a gateway, and we asked users to use 
 hadoop.socks.server to access the cluster.  
 Recently, we upgraded our cluster from 0.17.2 to 0.18.1.  However, this 
 creates the following problem:
 Wrote input for Map #4
 Starting Job
 java.net.UnknownHostException: unknown host: hadoop-master
 at org.apache.hadoop.ipc.Client$Connection.init(Client.java:195)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:779)
 at org.apache.hadoop.ipc.Client.call(Client.java:704)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
 at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
 Nothing else has changed and we were able to duplicate this error on a 
 separate cluster setup, so we
 believe 0.18.1 has broken this feature.  
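 For context, the client-side SOCKS setup being discussed looks roughly like 
 this; hadoop.socks.server is the key from this report, while the 
 socket-factory key and class are what I believe pair with it, so verify 
 against your release:
 {noformat}
 import org.apache.hadoop.conf.Configuration;

 public class SocksClientConf {
   static Configuration socksConf(String gateway) {
     Configuration conf = new Configuration();
     // Route RPC sockets through a SOCKS proxy (assumed key and class names).
     conf.set("hadoop.rpc.socket.factory.class.default",
              "org.apache.hadoop.net.SocksSocketFactory");
     conf.set("hadoop.socks.server", gateway);  // e.g. "gateway.example.com:1080"
     return conf;
   }
 }
 {noformat}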

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-4748) Chinese Translation of Hadoop-Related Documents

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-4748.
-

Resolution: Won't Fix

Out of date

 Chinese Translation of Hadoop-Related Documents
 ---

 Key: HADOOP-4748
 URL: https://issues.apache.org/jira/browse/HADOOP-4748
 Project: Hadoop Common
  Issue Type: New Feature
  Components: documentation
Reporter: He Yongqiang
Priority: Minor
 Attachments: hadoopwiki.rar


 Translate Hadoop-related documents, including javadoc, tutorial, programmer 
 guide, technical analyses etc., into the Chinese language.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-4382) Run Hadoop sort benchmark on Amazon EC2

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-4382.
-

Resolution: Won't Fix

Covered by apache whirr

 Run Hadoop sort benchmark on Amazon EC2
 ---

 Key: HADOOP-4382
 URL: https://issues.apache.org/jira/browse/HADOOP-4382
 Project: Hadoop Common
  Issue Type: Test
  Components: contrib/cloud
Reporter: Tom White
Assignee: Tom White
 Attachments: hadoop-4382-v2.patch, hadoop-4382.patch


 By running a benchmark on EC2 we can see how well Hadoop performs, how to 
 tune it, and how performance changes between releases.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-4042) bin/hadoop should check `which java` to find java

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-4042.
-

Resolution: Won't Fix

Per other jira we only check JAVA_HOME (and java_home on osx)

 bin/hadoop should check `which java` to find java
 -

 Key: HADOOP-4042
 URL: https://issues.apache.org/jira/browse/HADOOP-4042
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.18.0
Reporter: Michael Bieniosek
Priority: Minor

 Currently, the bin/hadoop script tries to find java in JAVA_HOME/bin/java.  
 If JAVA_HOME is not set, it errors out.
 Instead, I think it should check `which java 2>/dev/null` to see if there is 
 a java on the user's PATH.  If a java is on the user's path, the script 
 should just set JAVA=java.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-4022) mapper output truncated on pseudo-distributed cluster

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-4022.
-

Resolution: Won't Fix

Out of date

 mapper output truncated on pseudo-distributed cluster
 -

 Key: HADOOP-4022
 URL: https://issues.apache.org/jira/browse/HADOOP-4022
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.17.0, 0.18.0
Reporter: Karl Anderson

 On a pseudo-distributed test run, I'm seeing truncated mapper output.  I 
 don't see this when running the same job on a real cluster managed with 
 hadoop-ec2.
 With a no-reducers streaming job, I'm getting several output files, each 
 truncated to 325.24 KB, and several zero-length output files.  With cat as 
 my reducer, I get one output file, again truncated to 325.24 KB.  This only 
 happens in a pseudo-distributed Hadoop run - when I run the mapper on the 
 command line (cat input_file | ./mapper.py | sort), I get the full output.  
 Input splitting isn't a factor, my input file is small enough to fit in one 
 input split (and truncated input would just break the mapper, not truncate 
 its output).
 My mapper is outputting the default key-value lines for streaming, with a tab 
 separating the key and the value, and no newlines or tabs in the value.
 Truncation happens in the middle of a line.  My lines are very long, the 
 lines themselves are over 325.24 KB.  This isn't the most efficient use of 
 Hadoop, I know, I'm just putting a job in my pipeline while I work on a 
 better implementation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3978) Using regular desktops as hadoop slaves

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3978.
-

Resolution: Won't Fix

Out of date

 Using regular desktops as hadoop slaves
 ---

 Key: HADOOP-3978
 URL: https://issues.apache.org/jira/browse/HADOOP-3978
 Project: Hadoop Common
  Issue Type: Wish
 Environment: 1) Windows XP/Vista
 2) MAC OS/X, Linux
Reporter: Niels Basjes
Priority: Minor

 In many companies there are a lot of desktop systems (usually XP) running all 
 day without really doing much work.
 The basic idea of this wish is similar to what SETI@home did in a screen 
 saver: Turn the idle time of many desktops into a lot of hadoop slaves.
 My wish is that an easy to deploy package (MSI package?) is created that 
 turns an XP system into a hadoop slave.
 The primary task of such a system is being a user desktop. So the hadoop 
 software on such a slave will have restrictions on how many system resources 
 (memory, cpu, network) it can use.
 Having such a feature will make it viable for small companies to start using 
 hadoop for tasks and experiments.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3929) I would like to improve the archive tool [see issue 3307].

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3929.
-

Resolution: Won't Fix

Out of date

 I would like to improve the archive tool [see issue 3307].
 --

 Key: HADOOP-3929
 URL: https://issues.apache.org/jira/browse/HADOOP-3929
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Dick King
   Original Estimate: 504h
  Remaining Estimate: 504h

 I have a tool written atop the libhdfs library that implements an archive 
 system.  It's working [in C++]
 JIRA #3307 documents a native DFS archive system, first available in 18.0 .
 I would like to port my code, and thereby extend that system in 3 directions:
 1: archives will be immutable in 18.0 .  I would like to provide an API to 
 let you add, delete, and modify files.
1a: You would want to be able to batch such operations and perform them 
 all at once when a batch is complete.
 2: the tree to be archived must be in dfs in 18.0 .  I would like it to be 
 possible for the tree to contain some local filesystem files as well [think 
 org.apache.hadoop.fs.Path ]
2a: I realize that this would preclude parallel modification when a local 
 filesystem is used
2b: I don't have a convincing story re two processes simultaneously 
 modifying the same archive, even for a disjoint set of files, but I'm willing 
 to discuss this.
 3: i would like it to be possible to batch the changes and make them all in 
 one operation, to reduce DFS activity.
 I had in-person discussions on this with user mahadev .  He is encouraging me 
 to file this bug report so we can broaden this discussion.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3871) add anchors to top of job tracker web page

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3871.
-

Resolution: Duplicate

Covered by MR2.

 add anchors to top of job tracker web page
 --

 Key: HADOOP-3871
 URL: https://issues.apache.org/jira/browse/HADOOP-3871
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: craig weisenfluh

 add anchors to job tracker webpage that allow for rapid navigation to 
 running, completed, and failed jobs

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3870) show job priority on first page

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3870.
-

Resolution: Duplicate

Covered by MR2.

 show job priority on first page
 ---

 Key: HADOOP-3870
 URL: https://issues.apache.org/jira/browse/HADOOP-3870
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: craig weisenfluh

 to allow for filtering of job priority and for changing job priority via the 
 jobtracker first page (web page), the job priority needs to be added to the 
 running jobs table.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3869) provide web interface to filter jobtracker jobs

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3869.
-

Resolution: Duplicate

Covered by MR2.

 provide web interface to filter jobtracker jobs
 ---

 Key: HADOOP-3869
 URL: https://issues.apache.org/jira/browse/HADOOP-3869
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: craig weisenfluh

 add search functionality that will allow for searching jobs by user, job id, 
 priority, or job name. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3867) provide interface to jobtracker admin to kill or reprioritize one or more jobs

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3867.
-

Resolution: Duplicate

Covered by MR2.

 provide interface to jobtracker admin to kill or reprioritize one or more jobs
 --

 Key: HADOOP-3867
 URL: https://issues.apache.org/jira/browse/HADOOP-3867
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: craig weisenfluh

 when a large number of jobs are present, killing or re-prioritizing is not 
 straightforward.   An interface consisting of method to select and then apply 
 an action to a group of jobs would make such modifications more manageable 
 when dealing with a lot of jobs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3818) Not possible to access a FileSystem from within a ShutdownHook

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3818.
-

Resolution: Won't Fix

Out of date

 Not possible to access a FileSystem from within a ShutdownHook
 --

 Key: HADOOP-3818
 URL: https://issues.apache.org/jira/browse/HADOOP-3818
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 0.17.1
Reporter: Rowan Nairn
Priority: Minor

 FileSystem uses addShutdownHook to close all FileSystems at exit.  This makes 
 it impossible to access a FileSystem from within your own ShutdownHook 
 threads, say for deleting incomplete output.  Using a pre-existing FileSystem 
 object is unsafe since it may be closed by the time the thread executes.  
 Using FileSystem.get(...) results in an exception:
 Exception in thread Thread-10 java.lang.IllegalStateException: Shutdown in 
 progress
   at java.lang.Shutdown.add(Shutdown.java:81)
   at java.lang.Runtime.addShutdownHook(Runtime.java:190)
   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1293)
   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:203)
   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:108)
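 A self-contained sketch of the underlying JDK behavior (nothing 
 Hadoop-specific is needed to reproduce it): registering a hook while shutdown 
 is in progress throws, which is what FileSystem.get() trips over via its 
 cache's own hook:
 {noformat}
 public class ShutdownHookDemo {
   public static void main(String[] args) {
     Runtime.getRuntime().addShutdownHook(new Thread() {
       public void run() {
         try {
           // Anything that registers another hook from here fails:
           Runtime.getRuntime().addShutdownHook(new Thread());
         } catch (IllegalStateException e) {
           System.err.println(e);  // "Shutdown in progress"
         }
       }
     });
   }
 }
 {noformat}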

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3568) Don't need to use toString() on strings (code cleanup)

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3568.
-

Resolution: Won't Fix

No pointer/patch

 Don't need to use toString() on strings (code cleanup)
 --

 Key: HADOOP-3568
 URL: https://issues.apache.org/jira/browse/HADOOP-3568
 Project: Hadoop Common
  Issue Type: Improvement
  Components: test
Affects Versions: 0.17.0
Reporter: Tim Halloran
Priority: Minor

 Don't need to call toString on a String type.  This occurs in several places 
 in the test code.  Patches below:
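 No patch text is preserved in this archive; as a minimal illustration of the 
 flagged pattern (all names here are made up):
 {noformat}
 public class ToStringCleanup {
   public static void main(String[] args) {
     String key = "io.sort.factor";
     String redundant = key.toString();  // the pattern being cleaned up
     String direct = key;                // equivalent and clearer
     System.out.println(redundant.equals(direct));  // true
   }
 }
 {noformat}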

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3482) Think to some modification to the API

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3482.
-

Resolution: Won't Fix

Out of date

 Think to some modification to the API
 -

 Key: HADOOP-3482
 URL: https://issues.apache.org/jira/browse/HADOOP-3482
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Brice Arnould
Priority: Minor

 I know that it is out of question to break everything just for minor comfort 
 improvements.
 But I think that we should evaluate the cost of improving the API, and the 
 way we might do this without too much breaks, before the API freeze of 1.0 . 
 If we start now, it could be a gradual process that would end only with the 
 1.0 release.
 I do not particulary focus on those modifications, but here is a samples of 
 what questions I would like to raise.
 # Text could provide an append(Text other) method
 # Text could implements CharSequence
 # Some Iterators could be turned into IterableIterators.
 # We could consider a more consistent naming (Text could be renamed 
 TextWritable, JobConfigurable.configure() could be turn in 
 JobConfigurable.setConf() to match Configurable and so on)
 1. and 2. seem immediately accessible (see the sketch below), even if 1. would 
 benefit from the use of a resizable container to store bytes.
 3. requires in theory only the addition of a new IterableIterator class 
 inheriting from Iterator. But making that change visible to the user would 
 require a change in some interfaces like Mapper and Reducer.
 4. *Configurable:* If those interfaces are only used in ReflectionUtils, 
 it would be enough to deprecate JobConfigurable and to consider it as a 
 particular case of Configurable. Another option would be to deprecate both 
 and to add support in newInstance for passing parameters to constructors.
 *Text:* Text could be an empty class inheriting from TextWritable, in a package 
 like org.hadoop.compat.
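 A minimal, self-contained sketch of items 1 and 2; this is not the real 
 o.a.h.io.Text (which stores UTF-8 bytes), it only shows the proposed surface:
 {noformat}
 public class AppendableText implements CharSequence {
   private final StringBuilder buf = new StringBuilder();

   public AppendableText(String s) { buf.append(s); }

   // item 1: an append(Text other) style method
   public void append(AppendableText other) { buf.append(other.buf); }

   // item 2: the CharSequence view
   public int length() { return buf.length(); }
   public char charAt(int i) { return buf.charAt(i); }
   public CharSequence subSequence(int a, int b) { return buf.subSequence(a, b); }
   public String toString() { return buf.toString(); }
 }
 {noformat}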

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3554) LineRecordReader needs more synchronization

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3554.
-

Resolution: Won't Fix

Out of date

 LineRecordReader needs more synchronization
 ---

 Key: HADOOP-3554
 URL: https://issues.apache.org/jira/browse/HADOOP-3554
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.17.0
 Environment: All java platforms
Reporter: Aaron Greenhouse
 Attachments: HADOOP-3445.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 LineRecordReader has three index fields start, end, and pos.  All of these 
 fields are long, which means that, in general, access to them is not atomic.  
 This can cause problems if the fields are accessed without appropriate 
 synchronization.  
 I propose the following changes to the class (a minimal sketch follows below):
 - Make the fields start and end final.  This requires some minor changes to 
 the constructor LineRecordReader(Configuration, FileSplit).
 - Make the method getProgress() synchronized.
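 A sketch of the two changes, assuming fields like the ones named above; the 
 progress formula is illustrative:
 {noformat}
 public class LineRecordReaderSketch {
   private final long start;  // final: safe to read without synchronization
   private final long end;
   private long pos;          // mutable 64-bit field: guard its access

   public LineRecordReaderSketch(long start, long end) {
     this.start = start;
     this.end = end;
     this.pos = start;
   }

   // synchronized: the read of pos is atomic and visible across threads
   public synchronized float getProgress() {
     if (start == end) return 0.0f;
     return Math.min(1.0f, (pos - start) / (float) (end - start));
   }

   public synchronized void advance(long n) { pos += n; }
 }
 {noformat}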

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3555) Add more synchronization to JobStatus

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3555.
-

Resolution: Won't Fix

Out of date

 Add more synchronization to JobStatus
 -

 Key: HADOOP-3555
 URL: https://issues.apache.org/jira/browse/HADOOP-3555
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.17.0
Reporter: Aaron Greenhouse
   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 The methods getJobId(), readFields(), and write() need to be made 
 synchronized.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-3473) io.sort.factor should default to 100 instead of 10

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-3473.
-

Resolution: Fixed

 io.sort.factor should default to 100 instead of 10
 --

 Key: HADOOP-3473
 URL: https://issues.apache.org/jira/browse/HADOOP-3473
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Reporter: Owen O'Malley
Assignee: Owen O'Malley

 10 is *really* conservative and can make merges much much more expensive.
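 A hedged usage sketch of overriding the default in job configuration, with 
 the key name as given in this issue:
 {noformat}
 import org.apache.hadoop.conf.Configuration;

 public class SortFactorConf {
   static Configuration tuned() {
     Configuration conf = new Configuration();
     conf.setInt("io.sort.factor", 100);  // merge up to 100 streams at once
     return conf;
   }
 }
 {noformat}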

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-2988) Trash not being deleted on time

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-2988.
-

Resolution: Won't Fix

Out of date

 Trash not being deleted on time
 ---

 Key: HADOOP-2988
 URL: https://issues.apache.org/jira/browse/HADOOP-2988
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 0.14.3
Reporter: Koji Noguchi
Priority: Trivial

 On one of our clusters, we set the Trash interval to 6 hrs. 
 However, sometimes the namenode doesn't delete the Trash dir on time.
 -bash-3.00$ hadoop dfs -ls /Trash
 Found 3 items
 /Trash/0711201201   dir   2007-11-20 06:00
 /Trash/0711201800   dir   2007-11-20 12:15
 /Trash/Current  dir   2007-11-20 18:01
 In our current setting, we're supposed to have only one 'current' and one 
 previous snapshot.  
 Grepping shows that /Trash/0711201201 was not even touched.
 My guess is that 1800 - 1201 = 5 hrs 59 min < 6 hrs.
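 A self-contained sketch of the suspected timing: a checkpoint taken at 12:01 
 is only 5 h 59 min old when the next pass runs at 18:00, so it misses the 
 6-hour cutoff and survives a full extra cycle (the check itself is a 
 hypothetical stand-in for the emptier loop):
 {noformat}
 public class TrashIntervalDemo {
   public static void main(String[] args) {
     long intervalMs = 6L * 60 * 60 * 1000;                 // 6 hours
     long lastCheckpointMs = ((12 * 60) + 1) * 60 * 1000L;  // 12:01
     long nowMs = 18L * 60 * 60 * 1000;                     // 18:00
     boolean expired = (nowMs - lastCheckpointMs) >= intervalMs;
     System.out.println(expired);  // false: 5h59m < 6h, deletion is skipped
   }
 }
 {noformat}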

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-2474) Automate EC2 DynDNS setup

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-2474.
-

Resolution: Won't Fix

Covered by whirr

 Automate EC2 DynDNS setup
 -

 Key: HADOOP-2474
 URL: https://issues.apache.org/jira/browse/HADOOP-2474
 Project: Hadoop Common
  Issue Type: Improvement
  Components: contrib/cloud
Reporter: Tom White

 Use the DynDNS webservice 
 (https://www.dyndns.com/developers/specs/syntax.html) to automatically set up 
 DNS for the EC2 cluster master node. If no DynDNS credentials are set (in 
 hadoop-ec2-env.sh) then prompt the user to set up DNS manually (the existing 
 behaviour).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-2678) Hadoop-Nightly does not run contrib tests if core tests fail

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-2678.
-

Resolution: Won't Fix

No more contrib

 Hadoop-Nightly does not run contrib tests if core tests fail
 

 Key: HADOOP-2678
 URL: https://issues.apache.org/jira/browse/HADOOP-2678
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Jim Kellerman
Assignee: Nigel Daley

 Unlike Hadoop-Patch, Hadoop-Nightly does not run the contrib tests if the 
 core tests fail. Even if the core tests fail, it is useful to know if there 
 has been a regression in contrib.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-2305) test-core tests have increased in elapsed time

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-2305.
-

Resolution: Won't Fix

Out of date

 test-core tests have increased in elapsed time
 --

 Key: HADOOP-2305
 URL: https://issues.apache.org/jira/browse/HADOOP-2305
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.16.0
Reporter: Jim Kellerman

 test-core test cases now take over an hour to execute.
 I don't know which tests are taking more time, but this seems like a large 
 increase over test-core execution in the recent past.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-1787) Reorder CHANGES.txt so that each section is sorted by bug number

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-1787.
-

Resolution: Won't Fix

Out of date

 Reorder CHANGES.txt so that each section is sorted by bug number
 

 Key: HADOOP-1787
 URL: https://issues.apache.org/jira/browse/HADOOP-1787
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley

 Since we've given up the total order on the change log descriptions, I 
 propose that we reorder each section in CHANGES.txt  to be sorted by JIRA 
 number.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-1867) use single parameter to specify a node's available ram

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-1867.
-

Resolution: Won't Fix

Out of date

 use single parameter to specify a node's available ram
 --

 Key: HADOOP-1867
 URL: https://issues.apache.org/jira/browse/HADOOP-1867
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Doug Cutting

 To simplify configuration, we should use a single parameter to indicate a 
 node's available RAM.  Sites should not need to adjust more than this single 
 parameter to configure a node's available memory.  In task JVMs, some 
 significant percentage of the memory should be reserved for application code, 
 with the remainder divided among various system buffers.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-1267) change default config to be single node rather than local for both map/reduce and hdfs

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-1267.
-

Resolution: Won't Fix

Out of date

 change default config to be single node rather than local for both 
 map/reduce and hdfs
 

 Key: HADOOP-1267
 URL: https://issues.apache.org/jira/browse/HADOOP-1267
 Project: Hadoop Common
  Issue Type: Task
  Components: conf
Reporter: Owen O'Malley
Assignee: Owen O'Malley

 I propose that we change the default config to be set up for a single node 
 rather than the current local, which uses direct file access and the local 
 job runner.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-1525) Java Service Wrapper for Hadoop

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-1525.
-

Resolution: Won't Fix

Out of date

 Java Service Wrapper for Hadoop
 ---

 Key: HADOOP-1525
 URL: https://issues.apache.org/jira/browse/HADOOP-1525
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 0.14.0
Reporter: Albert Strasheim
 Attachments: InstallServices.bat, UninstallServices.bat, 
 datanode-wrapper.conf, datanode.bat, namenode-wrapper.conf, namenode.bat, 
 secondarynamenode-wrapper.conf, secondarynamenode.bat


 I'm attaching Java Service Wrapper configuration files for namenode, datanode 
 and secondarynamenode to this issue. These should probably go in the conf/ 
 directory.
 I'm also attaching batch files to run these services in the console that can 
 be used to check that the configuration works. These should probably go in 
 the bin/ directory.
 By setting wrapper.app.parameter.2=-format the first time namenode.bat is 
 run, the user can do the necessary format.
 Apache ActiveMQ includes the Java Service Wrapper binaries in the tarball 
 they ship, so doing this for Hadoop seems feasible.
 More about Java Service Wrapper:
 http://wrapper.tanukisoftware.org/doc/english/introduction.html
 P. S.  It seems the ordering of the classpath is very important.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-953) huge log files

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-953.


Resolution: Fixed

 huge log files
 --

 Key: HADOOP-953
 URL: https://issues.apache.org/jira/browse/HADOOP-953
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.10.1
 Environment: N/A
Reporter: Andrew McNabb

 On our system, it's not uncommon to get 20 MB of logs with each MapReduce 
 job.  It would be very helpful if it were possible to configure Hadoop 
 daemons to write logs only when major things happen, but the only conf 
 options I could find are for increasing the amount of output.  The disk is 
 really a bottleneck for us, and I believe that short jobs would run much more 
 quickly with less disk usage.  We also believe that the high disk usage might 
 be triggering a kernel bug on some of our machines, causing them to crash.  
 If the 20 MB of logs went down to 20 KB, we would probably still have all of 
 the information we needed.
 Thanks!

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-601) we need some rpc retry framework

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-601.


Resolution: Fixed

 we need some rpc retry framework
 

 Key: HADOOP-601
 URL: https://issues.apache.org/jira/browse/HADOOP-601
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Affects Versions: 0.11.2
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HADOOP-601-v1.patch, HADOOP-601-v1.patch


 We need some mechanism for RPC calls that get exceptions to automatically 
 retry the call under certain circumstances. In particular, we often end up 
 with calls to rpcs being wrapped with retry loops for timeouts. We should be 
 able to make a retrying proxy that will call the rpc and retry in some 
 circumstances.
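 A minimal usage sketch of the retry framework this issue led to 
 (org.apache.hadoop.io.retry); the interface and the policy choice here are 
 illustrative:
 {noformat}
 import java.io.IOException;
 import java.util.concurrent.TimeUnit;
 import org.apache.hadoop.io.retry.RetryPolicies;
 import org.apache.hadoop.io.retry.RetryPolicy;
 import org.apache.hadoop.io.retry.RetryProxy;

 public class RetryDemo {
   interface NameService {
     String lookup(String key) throws IOException;
   }

   static NameService withRetries(NameService raw) {
     RetryPolicy policy =
         RetryPolicies.retryUpToMaximumCountWithFixedSleep(5, 1, TimeUnit.SECONDS);
     // Every method on the interface is wrapped in the retry policy.
     return (NameService) RetryProxy.create(NameService.class, raw, policy);
   }
 }
 {noformat}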

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-1899) the metrics system in the job tracker is running too often

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-1899.
-

Resolution: Fixed

 the metrics system in the job tracker is running too often
 --

 Key: HADOOP-1899
 URL: https://issues.apache.org/jira/browse/HADOOP-1899
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Reporter: Owen O'Malley
Assignee: Owen O'Malley

 The metrics system in the JobTracker is defaulting to every 5 seconds 
 computing all of the counters for all of the jobs. This is a substantial 
 amount of work, showing up as running in 20% of the snapshots that I've seen. 
 I'd like to lower the default interval to once every 60 seconds and make it a 
 low priority thread.
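 A generic sketch of the proposal in thread terms only, not the JobTracker 
 metrics code itself:
 {noformat}
 public class MetricsUpdater {
   public static Thread start(final Runnable computeCounters) {
     Thread t = new Thread(new Runnable() {
       public void run() {
         while (!Thread.currentThread().isInterrupted()) {
           computeCounters.run();
           try {
             Thread.sleep(60 * 1000L);  // 60s interval instead of 5s
           } catch (InterruptedException e) {
             return;
           }
         }
       }
     }, "metrics-updater");
     t.setDaemon(true);
     t.setPriority(Thread.MIN_PRIORITY);  // low priority, as proposed
     t.start();
     return t;
   }
 }
 {noformat}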

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-1072) VersionMismatch should be VersionMismatchException

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-1072.
-

Resolution: Won't Fix

There is already an IO class with that name.

 VersionMismatch should be VersionMismatchException
 --

 Key: HADOOP-1072
 URL: https://issues.apache.org/jira/browse/HADOOP-1072
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Affects Versions: 0.11.2
Reporter: Nigel Daley
Assignee: Rajagopal Natarajan
Priority: Minor
 Attachments: 1072.patch


 org.apache.hadoop.ipc.RPC$VersionMismatch extends IOException.  Its name 
 should follow the Java naming convention for Exceptions, and thus be 
 VersionMismatchException.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-502) Summer buffer overflow exception

2011-08-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-502.


Resolution: Won't Fix

Out of date.

 Summer buffer overflow exception
 

 Key: HADOOP-502
 URL: https://issues.apache.org/jira/browse/HADOOP-502
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 0.5.0
Reporter: Owen O'Malley
Assignee: Owen O'Malley

 The extended error message with the offending values finally paid off and I 
 was able to get the values that were causing the Summer buffer overflow 
 exception.
 java.lang.RuntimeException: Summer buffer overflow b.len=4096, off=0, 
 summed=512, read=2880, bytesPerSum=1, inSum=512
 at 
 org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:100)
 at 
 org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
 at java.io.BufferedInputStream.read1(BufferedInputStream.java:254)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
 at java.io.DataInputStream.read(DataInputStream.java:80)
 at 
 org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.copy(CopyFiles.java:190)
 at 
 org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.map(CopyFiles.java:391)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:196)
 at 
 org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1075)
 Caused by: java.lang.ArrayIndexOutOfBoundsException
 at java.util.zip.CRC32.update(CRC32.java:43)
 at 
 org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:98)
 ... 9 more
 Tracking through the code, what happens is that inside 
 FSDataInputStream.Checker.read(), verifySum gets an EOFException and turns 
 off the summing. Among other things this sets the bytesPerSum to 1. 
 Unfortunately, that leads to the ArrayIndexOutOfBoundsException.
 I think the problem is that the original EOF exception was logged and 
 ignored. I propose that we allow the original EOF to propagate back to the 
 caller. (So that file not found will still disable the checksum checking, but 
 we will detect truncated checksum files.)
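 A hedged, simplified sketch of the proposed behavior; the stream handling is 
 illustrative, not the 0.5.0 implementation:
 {noformat}
 import java.io.DataInputStream;
 import java.io.EOFException;
 import java.io.FileInputStream;
 import java.io.FileNotFoundException;
 import java.io.IOException;

 public class ChecksumOpen {
   // Returns null (checksumming disabled) only when the .crc file is absent;
   // a truncated .crc file propagates EOFException to the caller.
   static DataInputStream openSums(String crcPath) throws IOException {
     DataInputStream sums;
     try {
       sums = new DataInputStream(new FileInputStream(crcPath));
     } catch (FileNotFoundException e) {
       return null;
     }
     try {
       sums.readInt();  // stands in for reading the checksum header
     } catch (EOFException e) {
       sums.close();
       throw e;         // surface the truncation instead of logging it
     }
     return sums;
   }
 }
 {noformat}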

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7531) Add servlet util methods for handling paths in requests

2011-08-09 Thread Eli Collins (JIRA)
Add servlet util methods for handling paths in requests 


 Key: HADOOP-7531
 URL: https://issues.apache.org/jira/browse/HADOOP-7531
 Project: Hadoop Common
  Issue Type: Improvement
  Components: util
Affects Versions: 0.23.0
Reporter: Eli Collins
Assignee: Eli Collins
 Fix For: 0.23.0


Common side of HDFS-2235.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7526) Add TestPath tests for URI conversion and reserved characters

2011-08-07 Thread Eli Collins (JIRA)
Add TestPath tests for URI conversion and reserved characters  
---

 Key: HADOOP-7526
 URL: https://issues.apache.org/jira/browse/HADOOP-7526
 Project: Hadoop Common
  Issue Type: Test
  Components: fs
Affects Versions: 0.23.0
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
 Fix For: 0.23.0
 Attachments: hadoop-7526-1.patch

TestPath needs tests that cover URI conversion (eg places where Paths and URIs 
differ) and handling of URI reserved characters in paths. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7527) Make URL encoding consistent

2011-08-07 Thread Eli Collins (JIRA)
Make URL encoding consistent


 Key: HADOOP-7527
 URL: https://issues.apache.org/jira/browse/HADOOP-7527
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: Eli Collins


URL encoding is currently handled in at least 4 different ways. We should make 
these consistent:
# Parameters are encoded when a URI object is created
# HttpServlet uses RequestQuoter to html escape parameter names and values
# StringEscapeUtils is used to escape parameters in ReconfigurationServlet and 
DatanodeJspHelper
# URLEncoder and URLDecoder are used in multiple places 

 We should also be consistent about how we pass file names in URLs; sometimes 
 they're passed in the path segment, sometimes they're passed in the query 
 string as parameters.
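 A self-contained illustration of why mixing mechanisms bites: form encoding 
 (URLEncoder) and URI path encoding disagree even on a simple space:
 {noformat}
 import java.net.URI;
 import java.net.URLEncoder;

 public class EncodingDemo {
   public static void main(String[] args) throws Exception {
     System.out.println(URLEncoder.encode("a b", "UTF-8"));  // prints a+b
     System.out.println(new URI("http", "host", "/a b", null).getRawPath());
     // prints /a%20b
   }
 }
 {noformat}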

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [VOTE] Release 0.20.204.0-rc0

2011-08-03 Thread Eli Collins
On Wed, Aug 3, 2011 at 2:14 PM, Matt Foley mfo...@hortonworks.com wrote:

 So, how are we doing with 0.20.204 content and trunk given the above
 proposal? Very well, in fact. Matt, Suresh and I have done a detailed
 analysis (separate email), please take a look.


 Here's the analysis of changes in 204 vs trunk.  We believe that ALL the
 changes in 204 are either already in trunk or do not need to be in trunk.
  Here is the list of 204 items not already in trunk, and their
 categorization.

 Please, if anyone thinks we've missed something, bring it to my attention
 and if it isn't already in trunk we will get it into trunk as expeditiously
 as possible.
 Thanks,
 --Matt

Looks good to me Matt. Thanks for doing the analysis.  I think the
regressions wrt trunk are from the original cut of the branch and don't
need to block the 204 release, ie the trunk-first discussion is IMO an
orthogonal thread.

Thanks,
Eli


[jira] [Created] (HADOOP-7503) Client#getRemotePrincipal NPEs when given invalid dfs.*.name

2011-08-02 Thread Eli Collins (JIRA)
Client#getRemotePrincipal NPEs when given invalid dfs.*.name


 Key: HADOOP-7503
 URL: https://issues.apache.org/jira/browse/HADOOP-7503
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc, security
Affects Versions: 0.20.203.0, 0.23.0
Reporter: Eli Collins


The following code in Client#getRemotePrincipal NPEs if security is enabled and 
dfs.https.address, dfs.secondary.http.address, dfs.secondary.https.address, or 
fs.default.name has an invalid value (eg hdfs://foo.bar.com.foo.bar.com:1000). 
We should check address.getAddress() for null (or check this earlier) and 
give a more helpful error message.

{noformat}
  return SecurityUtil.getServerPrincipal(conf.get(serverKey), address
.getAddress().getCanonicalHostName());
{noformat}
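A hedged sketch of the suggested guard; variable names follow the snippet 
above, while the wrapper method and message text are illustrative:
{noformat}
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.SecurityUtil;

public class PrincipalLookup {
  static String remotePrincipal(Configuration conf, String serverKey,
                                InetSocketAddress address) throws IOException {
    InetAddress resolved = address.getAddress();
    if (resolved == null) {  // unresolved, e.g. a typo'd hostname
      throw new IOException("Couldn't resolve " + address
          + "; check the value of " + serverKey);
    }
    return SecurityUtil.getServerPrincipal(conf.get(serverKey),
        resolved.getCanonicalHostName());
  }
}
{noformat}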


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7504) hadoop-metrics.properties missing some Ganglia31 options

2011-08-02 Thread Eli Collins (JIRA)
hadoop-metrics.properties missing some Ganglia31 options 
-

 Key: HADOOP-7504
 URL: https://issues.apache.org/jira/browse/HADOOP-7504
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.20.203.0, 0.23.0
Reporter: Eli Collins
Priority: Trivial


The jvm, rpc, and ugi sections of hadoop-metrics.properties should have 
Ganglia31 options like dfs and mapred

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7505) EOFException in RPC stack should have a nicer error message

2011-08-02 Thread Eli Collins (JIRA)
EOFException in RPC stack should have a nicer error message
---

 Key: HADOOP-7505
 URL: https://issues.apache.org/jira/browse/HADOOP-7505
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Affects Versions: 0.23.0
Reporter: Eli Collins
Priority: Minor


Lots of user logs involve a user running mismatched versions, and for one 
reason or another, they get EOFException instead of a proper version mismatch 
exception. We should be able to catch this at appropriate points, and have a 
nicer exception message explaining that it's a possible version mismatch, or 
that they're trying to connect to the incorrect port.
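A sketch of the idea at a hypothetical call boundary; the wrapper interface 
and the message text are illustrative:
{noformat}
import java.io.EOFException;
import java.io.IOException;

public class RpcEofWrapper {
  interface Call<T> {
    T invoke() throws IOException;
  }

  static <T> T invokeWithHint(Call<T> call) throws IOException {
    try {
      return call.invoke();
    } catch (EOFException eof) {
      throw new IOException("Unexpected EOF from the server; this often "
          + "indicates a client/server version mismatch or a connection "
          + "to the wrong port", eof);
    }
  }
}
{noformat}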

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [VOTE] Release 0.20.204.0-rc0

2011-07-28 Thread Eli Collins
We've done both, even within a branch (0.20.0 was voted on core-dev@
and 0.20.2 on general@).  The bylaws suggest general@ should be used,
which seems to make sense since we're releasing common, hdfs and mr. I
think either works as long as people know where to check.

http://hadoop.apache.org/bylaws.html

Voting
Decisions regarding the project are made by votes on the primary
project development mailing list (gene...@hadoop.apache.org)


On Thu, Jul 28, 2011 at 5:40 PM, Arun C Murthy a...@hortonworks.com wrote:
 In the past we've carried it out on common-dev:
 http://hadoop-common.472056.n3.nabble.com/VOTE-Release-Hadoop-0-21-0-candidate-2-td1181981.html

 Arun

 On Jul 28, 2011, at 5:33 PM, milind.bhandar...@emc.com wrote:

 Somehow I remember that the 0.20.203.0 vote was carried out on general@

 Here is a message from the archive:

 http://mail-archives.apache.org/mod_mbox/hadoop-general/201105.mbox/browser

 - milind

 ---
 Milind Bhandarkar


 -Original Message-
 From: Arun C Murthy [mailto:a...@hortonworks.com]
 Sent: Thursday, July 28, 2011 5:27 PM
 To: common-dev@hadoop.apache.org
 Subject: Re: [VOTE] Release 0.20.204.0-rc0

 Nope. general@ is only for announcements.

 AFAIK Votes are developer activities.

 Arun

 On Jul 28, 2011, at 5:11 PM, Aaron T. Myers wrote:

 Shouldn't this vote be taking place on general@, instead of common-dev@? I'm
 under the impression that that is where all votes are supposed to take
 place. Please correct me if I am wrong about that.

 --
 Aaron T. Myers
 Software Engineer, Cloudera



 On Thu, Jul 28, 2011 at 5:02 PM, Giridharan Kesavan
 gkesa...@yahoo-inc.comwrote:

 This issue is fixed with Eric's patch for HADOOP-7356. Since Owen is out on
 vacation, I am working on getting the release tarball.

 -Giri


 On 7/28/11 1:56 PM, Giridharan Kesavan gkesa...@yahoo-inc.com wrote:

 Myself and Eric Yang are looking into this.
 -Giri

 On 7/28/11 12:04 PM, Allen Wittenauer awittena...@linkedin.com
 wrote:


 On Jul 25, 2011, at 7:05 PM, Owen O'Malley wrote:

 I've created a release candidate for 0.20.204.0 that I would like to
 release.

 It is available at:
 http://people.apache.org/~omalley/hadoop-0.20.204.0-rc0/

 0.20.204.0 has many fixes including disk fail in place and the new rpm
 and
 deb packages. Fail in place allows the DataNode and TaskTracker to
 continue
 after a hard drive fails.


 Is it still failing to build according to Jenkins?










Re: compilation failing on tests contrib

2011-07-26 Thread Eli Collins
+cdh-user,  -common-dev(bcc)

Hi Keren,

Individual test failures for contrib projects are in
build/contrib/proj/test.   However, you probably don't care, since
there's no need to run all the tests just to build a tarball; just
remove the test and test-c++-libhdfs targets from the command you're
using to build (the one you previously posted).

Also, fwiw, none of these issues are specific to the Hadoop shipped in CDH.

Thanks,
Eli

On Tue, Jul 26, 2011 at 5:43 AM, Keren Ouaknine ker...@gmail.com wrote:
 Hello,

 I am in the process of compiling CDH3, using forrest 0.8 (thanks to Eli for
 pointing me to 7303).
 I still get a failure during the final phase: tests of contrib. Where can I
 see more details about the failure, and how can I re-run this part myself
 without recompiling (it took over 3 hours!)?

 Thanks,
 Keren

 PS: I don't get replies to my emails; I need to find them on the net. They
 are not going to any filter of mine, nor to spam. Any idea?


 BUILD FAILED
 /a/fr-02/vol/netforce/phd/ouaknine/keren_CDH3/hadoop-0.20.2-cdh3u1/build.xml:1134:
 The following error occurred while executing this line:
 /a/fr-02/vol/netforce/phd/ouaknine/keren_CDH3/hadoop-0.20.2-cdh3u1/build.xml:1123:
 The following error occurred while executing this line:
 /a/fr-02/vol/netforce/phd/ouaknine/keren_CDH3/hadoop-0.20.2-cdh3u1/src/contrib/build.xml:62:
 Tests failed!

 Total time: 190 minutes 44 seconds

 build.xml, line 86 in bold:
  <!-- ================================================== -->
  <!-- Test all the contrib system tests                  -->
  <!-- ================================================== -->
  <target name="test-system-contrib">
    <property name="hadoop.root" location="${root}/../../../"/>
    <property name="build.contrib.dir" location="${hadoop.root}/build/contrib"/>
    <delete file="${build.contrib.dir}/testsfailed"/>
    <subant target="test-system">
       <property name="continueOnFailure" value="true"/>
       <property name="hadoop.home" value="${hadoop.home}"/>
       <property name="hadoop.conf.dir" value="${hadoop.conf.dir}"/>
       <property name="hadoop.conf.dir.deployed"
           value="${hadoop.conf.dir.deployed}"/>
       <fileset dir="." includes="hdfsproxy/build.xml"/>
       <fileset dir="." includes="streaming/build.xml"/>
       <fileset dir="." includes="fairscheduler/build.xml"/>
       <fileset dir="." includes="capacity-scheduler/build.xml"/>
       <fileset dir="." includes="gridmix/build.xml"/>
    </subant>
    <available file="${build.contrib.dir}/testsfailed" property="testsfailed"/>
 *  <fail if="testsfailed">Tests failed!</fail>*
  </target>

 --
 Keren Ouaknine
 Cell: +972 54 2565404
 Web: www.kereno.com



Re: compilation of CDH3

2011-07-25 Thread Eli Collins
+cdh-user  -common-dev(bcc)

Perhaps you're using forrest 0.9, the hadoop docs don't build with
forrest 0.9 (eg see HADOOP-7303), you need to use v 0.8 (and can
explicitly set it via -Dforrest.home).

You can use also use the binary target to build a tarball w/o docs.

Thanks,
Eli

On Mon, Jul 25, 2011 at 6:31 AM, Keren Ouaknine ker...@gmail.com wrote:
 Hello,

 I am compiling CDH3, and expect it to finish within 5-10 minutes.
 However the compilation process gets stuck. It happens during forrest
 (probably documentation generation).

 I killed the forrest process (don't need documentation at this stage), and
 expected to see compilation continue but it didn't help.
 These are my flags:

 ant -Dversion=0.20.ka0 -Dcompile.native=true -Dcompile.c++=true -Dlibhdfs=1
 -Dlibrecordio=true clean api-report tar test test-c++-libhdfs

 Thanks,
 Keren

 --
 Keren Ouaknine
 Cell: +972 54 2565404
 Web: www.kereno.com



[jira] [Resolved] (HADOOP-7417) Hadoop Management System (Umbrella)

2011-07-05 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-7417.
-

Resolution: Not A Problem

 Hadoop Management System (Umbrella)
 ---

 Key: HADOOP-7417
 URL: https://issues.apache.org/jira/browse/HADOOP-7417
 Project: Hadoop Common
  Issue Type: New Feature
 Environment: Java 6, Linux
Reporter: Eric Yang
Assignee: Eric Yang

 The primary goal of the Hadoop Management System is to build a component 
 around the management and deployment of Hadoop-related projects. This includes 
 software installation, configuration, application orchestration, deployment 
 automation, and monitoring of Hadoop.
 Prototype demo source code can be obtained from:
 http://github.com/macroadster/hms

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-7040) DiskChecker:mkdirsWithExistsCheck swallows FileNotFoundException.

2011-06-30 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-7040.
-

   Resolution: Fixed
Fix Version/s: (was: 0.20.2)
   0.20.203.1

This was already fixed in 0.20.203.

 DiskChecker:mkdirsWithExistsCheck swallows FileNotFoundException.
 -

 Key: HADOOP-7040
 URL: https://issues.apache.org/jira/browse/HADOOP-7040
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.20.1, 0.20.2, 0.21.0
Reporter: Boris Shkolnik
Assignee: Boris Shkolnik
 Fix For: 0.20.203.1

 Attachments: 4132253-3.patch


 As a result, DataNode.checkDir will miss the exception (it catches 
 DiskErrorException, not FileNotFoundException), and fail instead of ignoring 
 the non-existent directory.
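 As context, a paraphrased sketch of the call-site contract (hypothetical 
 helper, not the exact DataNode code): directory problems are expected to 
 surface as DiskErrorException so the caller can skip the bad directory, 
 which is why an exception of any other type escaping instead makes the 
 DataNode fail rather than ignore the directory.

{code}
import java.io.File;
import org.apache.hadoop.util.DiskChecker;
import org.apache.hadoop.util.DiskChecker.DiskErrorException;

public class DirCheckSketch {
  // Returns whether dir is usable; relies on checkDir reporting all
  // failures as DiskErrorException. Anything else thrown from inside
  // the check would bypass this handler.
  static boolean isUsable(File dir) {
    try {
      DiskChecker.checkDir(dir);
      return true;
    } catch (DiskErrorException e) {
      return false; // expected path: ignore the failed directory
    }
  }
}
{code}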

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-7409) TestTFileByteArrays is failing on Hudson

2011-06-28 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-7409.
-

Resolution: Duplicate

Looks like a dupe.

 TestTFileByteArrays is failing on Hudson
 

 Key: HADOOP-7409
 URL: https://issues.apache.org/jira/browse/HADOOP-7409
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.23.0
Reporter: Aaron T. Myers
 Fix For: 0.23.0


 This test has failed in the last 4 nightly builds, as seen here: 
 https://builds.apache.org/job/Hadoop-Common-trunk/
 I can't reproduce this failure on my machine, running the test either in 
 isolation or as part of the full suite.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7429) Common side of HDFS-2110

2011-06-27 Thread Eli Collins (JIRA)
Common side of HDFS-2110


 Key: HADOOP-7429
 URL: https://issues.apache.org/jira/browse/HADOOP-7429
 Project: Hadoop Common
  Issue Type: Improvement
  Components: util
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor


Common side of HDFS-2110, adds a new IOUtils copy bytes method and cleanup.
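 Since the JIRA text doesn't spell out the new method's signature, here is a 
 usage sketch of the long-standing IOUtils copy API that the cleanup builds 
 on (the new variant itself isn't shown):

{code}
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.hadoop.io.IOUtils;

public class CopyBytesExample {
  public static void main(String[] args) throws IOException {
    InputStream in = new FileInputStream(args[0]);
    OutputStream out = new FileOutputStream(args[1]);
    // Copy with a 4 KB buffer; the final 'true' closes both streams.
    IOUtils.copyBytes(in, out, 4096, true);
  }
}
{code}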

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Releases from branch-0.20

2011-06-24 Thread Eli Collins
I reverted the history that indicated there was a 0.20.3 release
(HADOOP-7372), moved the two jiras with a 0.20.4 fixVersion to 0.20.3,
and deleted the 0.20.4 fixVersion from jira.


On Thu, Jun 9, 2011 at 11:15 AM, Eli Collins e...@cloudera.com wrote:
 Hey guys,

 When filing HADOOP-7372 (CHANGES.txt claims there's a 0.20.3 release)
 I noticed there are fix versions for 0.20.3 and 0.20.4 on jira. If we
 don't intend to release 0.20.3 (Owen, you're the RM for branch-0.20
 right?) then we should remove these fix versions from jira as they're
 confusing to users, and should update the releases page
 (http://hadoop.apache.org/common/releases.html) to indicate there
 won't be a 20.3 or 20.4 release.

 Thanks,
 Eli



[jira] [Created] (HADOOP-7422) Test that the topology script is always passed IP addresses

2011-06-23 Thread Eli Collins (JIRA)
Test that the topology script is always passed IP addresses
---

 Key: HADOOP-7422
 URL: https://issues.apache.org/jira/browse/HADOOP-7422
 Project: Hadoop Common
  Issue Type: Test
  Components: test
Reporter: Eli Collins


Now that HADOOP-6682 has been fixed, Hadoop should always pass the topology 
script an IP address rather than a hostname. We should write a test that covers 
this (specifically that DNSToSwitchMapping#resolve is always passed IP 
addresses) so users can safely write topology scripts that don't handle 
hostnames.   
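 One way to cover this (a sketch, not a committed test; the class below is 
 hypothetical): plug a recording mapping into the resolution path and assert 
 it never saw a hostname.

{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.net.DNSToSwitchMapping;

// Hypothetical test helper: flags any name that doesn't look like an
// IPv4 address, so a test can assert resolve() only ever received IPs.
public class IpOnlyMapping implements DNSToSwitchMapping {
  public volatile boolean sawNonIp = false;

  public List<String> resolve(List<String> names) {
    List<String> racks = new ArrayList<String>(names.size());
    for (String name : names) {
      if (!name.matches("\\d{1,3}(\\.\\d{1,3}){3}")) {
        sawNonIp = true; // a hostname leaked through
      }
      racks.add("/default-rack");
    }
    return racks;
  }
}
{code}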

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7423) Document topology script requirements

2011-06-23 Thread Eli Collins (JIRA)
Document topology script requirements
-

 Key: HADOOP-7423
 URL: https://issues.apache.org/jira/browse/HADOOP-7423
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Eli Collins


The topology script documentation in cluster_setup.xml is unclear. The topology 
script:
# Only needs to handle IP addresses (not hostnames)
# Needs to handle multiple arguments for caching to work effectively

We should check in an example script or include one in the docs; a minimal 
sketch of one follows.
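Illustrative only (not a committed script), assuming a flat 
/etc/hadoop/topology.data file of "IP rack" pairs:

{code}
#!/bin/bash
# Emits exactly one rack per argument, so it stays correct when Hadoop
# passes several IP addresses in a single invocation.
MAP=/etc/hadoop/topology.data
while [ $# -gt 0 ]; do
  ip=$1; shift
  rack=$(awk -v ip="$ip" '$1 == ip { print $2 }' "$MAP")
  echo "${rack:-/default-rack}"
done
{code}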

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-7424) Log an error if the topology script doesn't handle multiple args

2011-06-23 Thread Eli Collins (JIRA)
Log an error if the topology script doesn't handle multiple args


 Key: HADOOP-7424
 URL: https://issues.apache.org/jira/browse/HADOOP-7424
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Eli Collins


ScriptBasedMapping#resolve currently warns and returns null if it passes n 
arguments to the topology script and gets back a different number of 
resolutions. This indicates a bug in the topology script (or its input) and 
should therefore be logged as an error.

{code}
// invalid number of entries returned by the script
LOG.warn("Script " + scriptName + " returned "
    + Integer.toString(m.size()) + " values when "
    + Integer.toString(names.size()) + " were expected.");
return null;
{code}

There's only one place in Hadoop (FSNamesystem init) where we pass multiple 
arguments to the topology script, and it is done only for performance (to 
trigger resolution/caching of all the hosts in the includes file on startup). 
So currently a topology script that doesn't handle multiple arguments just 
means the initial cache population doesn't work.
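A sketch of the proposed change (not a committed patch): the same message 
logged at error level, still returning null:

{code}
// Invalid number of entries returned by the script: this indicates a
// bug in the topology script (or its input), so log it as an error.
LOG.error("Script " + scriptName + " returned "
    + Integer.toString(m.size()) + " values when "
    + Integer.toString(names.size()) + " were expected.");
return null;
{code}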

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Releases from branch-0.20

2011-06-09 Thread Eli Collins
Hey guys,

When filing HADOOP-7372 (CHANGES.txt claims there's a 0.20.3 release)
I noticed there are fix versions for 0.20.3 and 0.20.4 on jira. If we
don't intend to release 0.20.3 (Owen, you're the RM for branch-0.20
right?) then we should remove these fix versions from jira as they're
confusing to users, and should update the releases page
(http://hadoop.apache.org/common/releases.html) to indicate there
won't be a 20.3 or 20.4 release.

Thanks,
Eli

