[jira] [Commented] (HDFS-2045) HADOOP_*_HOME environment variables no longer work for tar ball distributions

2011-06-22 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053093#comment-13053093
 ] 

Eli Collins commented on HDFS-2045:
---

If it doesn't impose a big burden, I think we should preserve the ability to run 
Hadoop from distinct build dirs. I use this pretty frequently, and from recent 
list traffic it seems others do as well.

 HADOOP_*_HOME environment variables no longer work for tar ball distributions
 -

 Key: HDFS-2045
 URL: https://issues.apache.org/jira/browse/HDFS-2045
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Aaron T. Myers

 It used to be that you could do the following:
 # Run `ant bin-package' in your hadoop-common checkout.
 # Set HADOOP_COMMON_HOME to the built directory of hadoop-common.
 # Run `ant bin-package' in your hadoop-hdfs checkout.
 # Set HADOOP_HDFS_HOME to the built directory of hadoop-hdfs.
 # Set PATH to have HADOOP_HDFS_HOME/bin and HADOOP_COMMON_HOME/bin on it.
 # Run `hdfs'.
 \\
 \\
 As of HDFS-1963, this no longer works since hdfs-config.sh is looking in 
 HADOOP_COMMON_HOME/bin/ for hadoop-config.sh, but it's being placed in 
 HADOOP_COMMON_HOME/libexec.
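
The mismatch described above can be illustrated with a small shell sketch. This is not the actual hdfs-config.sh: the function name find_hadoop_config is hypothetical, and checking libexec/ first with a fallback to bin/ is only an assumption about one way a lookup could tolerate both layouts.

```shell
# Hypothetical helper: print the path of hadoop-config.sh under the given
# common home, preferring libexec/ (new layout) over bin/ (old layout).
find_hadoop_config() {
  common_home="$1"
  for dir in libexec bin; do
    if [ -f "$common_home/$dir/hadoop-config.sh" ]; then
      echo "$common_home/$dir/hadoop-config.sh"
      return 0
    fi
  done
  # Neither location holds the script: report on stderr and fail.
  echo "hadoop-config.sh not found under $common_home" >&2
  return 1
}
```

A caller could then source the result, for example `. "$(find_hadoop_config "$HADOOP_COMMON_HOME")"`, instead of hard-coding one directory.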

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2045) HADOOP_*_HOME environment variables no longer work for tar ball distributions

2011-06-14 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049576#comment-13049576
 ] 

Aaron T. Myers commented on HDFS-2045:
--

In the absence of compatibility concerns, I agree with option 2. I'm sorry we 
didn't go with this in the first place.

That said, I don't feel strongly about maintaining the backward compatibility 
of this feature of being able to run Hadoop from distinct mr/hdfs/common 
build directories.

So, if no one else feels strongly about this, then I'm fine rolling with option 
2 (and closing this issue.)





[jira] [Commented] (HDFS-2045) HADOOP_*_HOME environment variables no longer work for tar ball distributions

2011-06-07 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045631#comment-13045631
 ] 

Owen O'Malley commented on HDFS-2045:
-

The HADOOP_*_HOME variables were a bad idea that I take full responsibility for.

The goal should be to move the deployment and development environments closer 
together rather than have two completely different structures.

Currently, you can do:

{code}
cd common
ant rpm
rpm -i build/hadoop-common-*.rpm
cd ../hdfs
ant rpm
rpm -i build/hadoop-hdfs-*.rpm
{code}

which is far easier to understand and you are running it like it is deployed.

Of course, you can also deploy using the tarballs instead of rpm or debs.

Do you have ideas for making development easier? I assume part of Nigel's goal of 
moving the subversion trees together is to eventually have a shared build 
directory.




[jira] [Commented] (HDFS-2045) HADOOP_*_HOME environment variables no longer work for tar ball distributions

2011-06-07 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045633#comment-13045633
 ] 

Owen O'Malley commented on HDFS-2045:
-

I'm sorry, I read too fast. Are you just proposing to fix the hdfs-config.sh?



[jira] [Commented] (HDFS-2045) HADOOP_*_HOME environment variables no longer work for tar ball distributions

2011-06-07 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045648#comment-13045648
 ] 

Eric Yang commented on HDFS-2045:
-

The current state of the code in trunk:

Developers:

# Direct source tree execution: specifying HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, 
and HADOOP_MAPRED_HOME works.
# Source tarball: same as 1.

Ops:

# Binary tarball: decompress all binary tarballs to the same destination without 
a prefix.  There is no need to use environment variables.
# RPM/DEB: works as described in Owen's comment, with no need to specify 
environment variables.
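
The binary tarball case above could look roughly like the following. This is a sketch only: the helper name extract_without_prefix is hypothetical, and it assumes a tar that supports --strip-components (GNU tar and bsdtar both do) and tarballs that each contain a single top-level directory.

```shell
# Hypothetical helper: unpack every given tarball into one destination,
# dropping each archive's top-level directory (the per-project prefix).
extract_without_prefix() {
  dest="$1"
  shift
  mkdir -p "$dest" || return 1
  for tarball in "$@"; do
    # --strip-components=1 removes the leading path element of each member.
    tar -xzf "$tarball" -C "$dest" --strip-components=1 || return 1
  done
}
```

Called with a destination followed by the common, hdfs, and mapred tarballs, this leaves one merged tree with a shared bin/ directory, so no HADOOP_*_HOME variables are needed to tell the pieces apart.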





[jira] [Commented] (HDFS-2045) HADOOP_*_HOME environment variables no longer work for tar ball distributions

2011-06-07 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045688#comment-13045688
 ] 

Aaron T. Myers commented on HDFS-2045:
--

bq. I'm sorry, I read too fast. Are you just proposing to fix the 
hdfs-config.sh?

Unfortunately, I think it would take more than that to restore the previous 
behavior. Since the jars, etc are now located based solely on the single 
HADOOP_PREFIX env var, one cannot have separate distribution directories for 
HDFS, MR, and Common.
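
To make the difference concrete, here is a minimal sketch contrasting the two lookup styles. Both function names are hypothetical, and the share/hadoop subdirectory is an assumed layout used only for illustration; the real scripts may assemble the classpath differently.

```shell
# Hypothetical: build a classpath from three independent home directories,
# so Common, HDFS, and MR can each live in their own build tree.
classpath_from_homes() {
  echo "$1/*:$2/*:$3/*"
}

# Hypothetical: build a classpath from one shared prefix. Every jar must sit
# under the same tree; share/hadoop is an assumed subdirectory name.
classpath_from_prefix() {
  echo "$1/share/hadoop/*"
}
```

With only the second style available, there is no argument through which a separate HDFS or MR directory could be supplied.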

My main issue is that all of this rearranging/reworking of the tarball and 
development layouts was done under the auspices of adding RPM support. I'll be 
honest and say that I hardly reviewed the RPM work at all, because I 
assumed from the title that it would not affect tarballs or the dev 
environment. Had there been separate JIRAs along the lines of "rearrange the 
layout of distribution builds" and "add RPM support", I likely would have 
reviewed the former.



[jira] [Commented] (HDFS-2045) HADOOP_*_HOME environment variables no longer work for tar ball distributions

2011-06-07 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045694#comment-13045694
 ] 

Eric Yang commented on HDFS-2045:
-

HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, and HADOOP_MAPRED_HOME were the result of 
splitting the source code into three different submodules.  While this works 
fine for developers who want to isolate each project, it makes configuration 
difficult for production use. HDFS and MAPRED run as their own uid, so the 
amount of configuration just multiplies.

To solve this problem, there are a few options:

Option 1.  Package all common shell scripts in the common jar file. When the 
binary tarball is built, the common shell scripts are rearranged and merged 
into the binary tarball distribution, and the HADOOP_*_HOME environment 
variables are removed completely.  $HADOOP_PREFIX is the only hint (generated 
from the shell script path, with no need to define it in the environment) that 
tells all hadoop programs exactly where the bits are laid out.  When HDFS or 
MAPREDUCE is deployed, there is no need to deploy the COMMON tarball.  To make 
this work for developers, *-config.sh should be moved to 
$HADOOP_PREFIX/libexec.  During the build process, hadoop-common-*.jar is 
extracted to obtain the common shell scripts.  The developer and binary layouts 
are then closer to each other.  (When the project is converted to maven, this 
keeps hdfs/mapreduce loosely coupled and reduces duplicated shell scripts.)

Option 2. Preserve HADOOP_*_HOME for source code execution.  Environment driven 
layout does not work on the binary tarball. Change the tarball prefix from 
hadoop-[common|mapred|hdfs]-0.23.0-SNAPSHOT to hadoop-[version] for easy 
extraction.

Option 3.  Enable HADOOP_*_HOME for the binary tarball.  (Risk of crashing the 
system due to bad environment variable setup.)

Option 4.  Merge hdfs/mapreduce back into the same project, but as 
subdirectories, to reduce duplicated shell scripts.

I am inclined to vote for option 2.
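
Option 1 relies on deriving $HADOOP_PREFIX from the location of the running script rather than from the environment. A minimal sketch of that idea follows; the function name prefix_from_script is hypothetical, and it assumes the script lives one level below the prefix (for example in bin/ or libexec/).

```shell
# Hypothetical helper: given the path of a script in <prefix>/bin or
# <prefix>/libexec, print the absolute <prefix> directory.
prefix_from_script() {
  # Resolve the directory that contains the script to an absolute path.
  script_dir=$(cd "$(dirname "$1")" && pwd) || return 1
  # The prefix is the parent of that directory.
  (cd "$script_dir/.." && pwd)
}
```

Inside a launcher script this might be used as `HADOOP_PREFIX=$(prefix_from_script "$0")`, so nothing has to be exported beforehand.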
