[ https://issues.apache.org/jira/browse/HADOOP-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236645#comment-13236645 ]
FKorning commented on HADOOP-7682:
----------------------------------

There are a bunch of issues at work. I've patched this up locally on my own 1.0.2-SNAPSHOT, but it takes a lot of yak-shaving to fix.

---

First you need to set up hadoop-1.0.1 including source, ant, ivy, and cygwin with ssh/ssl and tcp_wrappers.

Then use sshd_config to create a cyg_server privileged user. From an admin cygwin shell, you then have to edit the /etc/passwd file and give that user a valid shell and user home, change the password for the user, and finally generate ssh keys for the user and copy the user's id_rsa.pub public key into ~/.ssh/authorized_keys.

If done right you should be able to ssh cyg_server@localhost.

---

Now the main problem is a confusion between the hadoop shell scripts, which expect unix paths like /tmp, and the hadoop java binaries, which interpret this path as C:\tmp.

Unfortunately, neither Cygwin symlinks nor even Windows NT Junctions are supported by the java io filesystem. Thus the only way to get around this is to force the cygwin paths to be identical to the windows paths.

I get around this by creating a circular symlink "/cygwin" -> "/". To avoid confusion with "C:" drive mappings, all my paths are relative. This means that windows "\cygwin\tmp" equals cygwin's "/cygwin/tmp".

For pid files use /cygwin/tmp/
For tmp files use /cygwin/tmp/hadoop-${USER}/
For log files use /cygwin/tmp/hadoop-${USER}/logs/

---

First, the ssh slaves invocation wrapper is broken because it fails to provide the user's ssh login, which cygwin openssh does not default to.

slaves.sh:

for slave in `cat "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
  ssh -l $USER $HADOOP_SSH_OPTS $slave $"${@// /\\ }" \
    2>&1 | sed "s/^/$slave: /" &
  if [ "$HADOOP_SLAVE_SLEEP" != "" ]; then
    sleep $HADOOP_SLAVE_SLEEP
  fi
done

Next the hadoop shell scripts are broken.
You need to fix the environment for cygwin paths in hadoop-env.sh, and then make sure this file is invoked by both hadoop-config.sh and the hadoop* sh wrapper script. For me its JRE java invocation was also broken, so I provide the whole script below.

hadoop-env.sh:

HADOOP_PID_DIR=/cygwin/tmp/
HADOOP_TMP_DIR=/cygwin/tmp/hadoop-${USER}
HADOOP_LOG_DIR=/cygwin/tmp/hadoop-${USER}/logs

hadoop (sh):

#!/usr/bin/env bash

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# The Hadoop command script
#
# Environment Variables
#
#   JAVA_HOME        The java implementation to use. Overrides JAVA_HOME.
#
#   HADOOP_CLASSPATH Extra Java CLASSPATH entries.
#
#   HADOOP_USER_CLASSPATH_FIRST   When defined, the HADOOP_CLASSPATH is
#                                 added in the beginning of the global
#                                 classpath. Can be defined, for example,
#                                 by doing
#                                 export HADOOP_USER_CLASSPATH_FIRST=true
#
#   HADOOP_HEAPSIZE  The maximum amount of heap to use, in MB.
#                    Default is 1000.
#
#   HADOOP_OPTS      Extra Java runtime options.
#
#   HADOOP_NAMENODE_OPTS       These options are added to HADOOP_OPTS
#   HADOOP_CLIENT_OPTS         when the respective command is run.
#   HADOOP_{COMMAND}_OPTS etc  HADOOP_JT_OPTS applies to JobTracker
#                              for e.g. HADOOP_CLIENT_OPTS applies to
#                              more than one command (fs, dfs, fsck,
#                              dfsadmin etc)
#
#   HADOOP_CONF_DIR  Alternate conf dir. Default is ${HADOOP_HOME}/conf.
#
#   HADOOP_ROOT_LOGGER The root appender. Default is INFO,console
#

bin=`dirname "$0"`
bin=`cd "$bin"; pwd`

cygwin=false
case "`uname`" in
CYGWIN*) cygwin=true;;
esac

if [ -e "$bin"/../libexec/hadoop-config.sh ]; then
  . "$bin"/../libexec/hadoop-config.sh
else
  . "$bin"/hadoop-config.sh
fi

# if no args specified, show usage
if [ $# = 0 ]; then
  echo "Usage: hadoop [--config confdir] COMMAND"
  echo "where COMMAND is one of:"
  echo "  namenode -format     format the DFS filesystem"
  echo "  secondarynamenode    run the DFS secondary namenode"
  echo "  namenode             run the DFS namenode"
  echo "  datanode             run a DFS datanode"
  echo "  dfsadmin             run a DFS admin client"
  echo "  mradmin              run a Map-Reduce admin client"
  echo "  fsck                 run a DFS filesystem checking utility"
  echo "  fs                   run a generic filesystem user client"
  echo "  balancer             run a cluster balancing utility"
  echo "  fetchdt              fetch a delegation token from the NameNode"
  echo "  jobtracker           run the MapReduce job Tracker node"
  echo "  pipes                run a Pipes job"
  echo "  tasktracker          run a MapReduce task Tracker node"
  echo "  historyserver        run job history servers as a standalone daemon"
  echo "  job                  manipulate MapReduce jobs"
  echo "  queue                get information regarding JobQueues"
  echo "  version              print the version"
  echo "  jar <jar>            run a jar file"
  echo "  distcp <srcurl> <desturl> copy file or directories recursively"
  echo "  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive"
  echo "  classpath            prints the class path needed to get the"
  echo "                       Hadoop jar and the required libraries"
  echo "  daemonlog            get/set the log level for each daemon"
  echo " or"
  echo "  CLASSNAME            run the class named CLASSNAME"
  echo "Most commands print help when invoked w/o parameters."
  exit 1
fi

# get arguments
COMMAND=$1
shift

# Determine if we're starting a secure datanode, and if so, redefine appropriate variables
if [ "$COMMAND" == "datanode" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_SECURE_DN_USER" ]; then
  HADOOP_PID_DIR=$HADOOP_SECURE_DN_PID_DIR
  HADOOP_LOG_DIR=$HADOOP_SECURE_DN_LOG_DIR
  HADOOP_IDENT_STRING=$HADOOP_SECURE_DN_USER
  starting_secure_dn="true"
fi

if [ "$JAVA_HOME" != "" ]; then
  #echo "JAVA_HOME: $JAVA_HOME"
  JAVA_HOME="$JAVA_HOME"
fi

# some Java parameters
if $cygwin; then
  JAVA_HOME=`cygpath -w "$JAVA_HOME"`
  #echo "cygwin JAVA_HOME: $JAVA_HOME"
fi
if [ "$JAVA_HOME" == "" ]; then
  echo "Error: JAVA_HOME is not set: $JAVA_HOME"
  exit 1
fi

JAVA=$JAVA_HOME/bin/java
JAVA_HEAP_MAX=-Xmx1000m

# check envvars which might override default args
if [ "$HADOOP_HEAPSIZE" != "" ]; then
  #echo "run with heapsize $HADOOP_HEAPSIZE"
  JAVA_HEAP_MAX="-Xmx""$HADOOP_HEAPSIZE""m"
  #echo $JAVA_HEAP_MAX
fi

# CLASSPATH initially contains $HADOOP_CONF_DIR
CLASSPATH="${HADOOP_CONF_DIR}"
if [ "$HADOOP_USER_CLASSPATH_FIRST" != "" ] && [ "$HADOOP_CLASSPATH" != "" ] ; then
  CLASSPATH=${CLASSPATH}:${HADOOP_CLASSPATH}
fi
CLASSPATH=${CLASSPATH}:$JAVA_HOME/lib/tools.jar

# for developers, add Hadoop classes to CLASSPATH
if [ -d "$HADOOP_HOME/build/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_HOME/build/classes
fi
if [ -d "$HADOOP_HOME/build/webapps" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_HOME/build
fi
if [ -d "$HADOOP_HOME/build/test/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_HOME/build/test/classes
fi
if [ -d "$HADOOP_HOME/build/tools" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_HOME/build/tools
fi

# so that filenames w/ spaces are handled correctly in loops below
IFS=

# for releases, add core hadoop jar & webapps to CLASSPATH
if [ -e $HADOOP_PREFIX/share/hadoop/hadoop-core-* ]; then
  # binary layout
  if [ -d "$HADOOP_PREFIX/share/hadoop/webapps" ]; then
    CLASSPATH=${CLASSPATH}:$HADOOP_PREFIX/share/hadoop
  fi
  for f in $HADOOP_PREFIX/share/hadoop/hadoop-core-*.jar; do
    CLASSPATH=${CLASSPATH}:$f;
  done

  # add libs to CLASSPATH
  for f in $HADOOP_PREFIX/share/hadoop/lib/*.jar; do
    CLASSPATH=${CLASSPATH}:$f;
  done

  for f in $HADOOP_PREFIX/share/hadoop/lib/jsp-2.1/*.jar; do
    CLASSPATH=${CLASSPATH}:$f;
  done

  for f in $HADOOP_PREFIX/share/hadoop/hadoop-tools-*.jar; do
    TOOL_PATH=${TOOL_PATH}:$f;
  done
else
  # tarball layout
  if [ -d "$HADOOP_HOME/webapps" ]; then
    CLASSPATH=${CLASSPATH}:$HADOOP_HOME
  fi
  for f in $HADOOP_HOME/hadoop-core-*.jar; do
    CLASSPATH=${CLASSPATH}:$f;
  done

  # add libs to CLASSPATH
  for f in $HADOOP_HOME/lib/*.jar; do
    CLASSPATH=${CLASSPATH}:$f;
  done

  if [ -d "$HADOOP_HOME/build/ivy/lib/Hadoop/common" ]; then
    for f in $HADOOP_HOME/build/ivy/lib/Hadoop/common/*.jar; do
      CLASSPATH=${CLASSPATH}:$f;
    done
  fi

  for f in $HADOOP_HOME/lib/jsp-2.1/*.jar; do
    CLASSPATH=${CLASSPATH}:$f;
  done

  for f in $HADOOP_HOME/hadoop-tools-*.jar; do
    TOOL_PATH=${TOOL_PATH}:$f;
  done
  for f in $HADOOP_HOME/build/hadoop-tools-*.jar; do
    TOOL_PATH=${TOOL_PATH}:$f;
  done
fi

# add user-specified CLASSPATH last
if [ "$HADOOP_USER_CLASSPATH_FIRST" = "" ] && [ "$HADOOP_CLASSPATH" != "" ]; then
  CLASSPATH=${CLASSPATH}:${HADOOP_CLASSPATH}
fi

# default log directory & file
if [ "$HADOOP_LOG_DIR" = "" ]; then
  HADOOP_LOG_DIR="$HADOOP_HOME/logs"
fi
if [ "$HADOOP_LOGFILE" = "" ]; then
  HADOOP_LOGFILE='hadoop.log'
fi

# default policy file for service-level authorization
if [ "$HADOOP_POLICYFILE" = "" ]; then
  HADOOP_POLICYFILE="hadoop-policy.xml"
fi

# restore ordinary behaviour
unset IFS

# figure out which class to run
if [ "$COMMAND" = "classpath" ] ; then
  if $cygwin; then
    CLASSPATH=`cygpath -wp "$CLASSPATH"`
  fi
  echo $CLASSPATH
  exit
elif [ "$COMMAND" = "namenode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.namenode.NameNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
elif [ "$COMMAND" = "secondarynamenode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_SECONDARYNAMENODE_OPTS"
elif [ "$COMMAND" = "datanode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
  if [ "$starting_secure_dn" = "true" ]; then
    HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
  else
    HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
  fi
elif [ "$COMMAND" = "fs" ] ; then
  CLASS=org.apache.hadoop.fs.FsShell
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfs" ] ; then
  CLASS=org.apache.hadoop.fs.FsShell
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfsadmin" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DFSAdmin
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "mradmin" ] ; then
  CLASS=org.apache.hadoop.mapred.tools.MRAdmin
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "fsck" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DFSck
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "balancer" ] ; then
  CLASS=org.apache.hadoop.hdfs.server.balancer.Balancer
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_BALANCER_OPTS"
elif [ "$COMMAND" = "fetchdt" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.DelegationTokenFetcher
elif [ "$COMMAND" = "jobtracker" ] ; then
  CLASS=org.apache.hadoop.mapred.JobTracker
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_JOBTRACKER_OPTS"
elif [ "$COMMAND" = "historyserver" ] ; then
  CLASS=org.apache.hadoop.mapred.JobHistoryServer
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_JOB_HISTORYSERVER_OPTS"
elif [ "$COMMAND" = "tasktracker" ] ; then
  CLASS=org.apache.hadoop.mapred.TaskTracker
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_TASKTRACKER_OPTS"
elif [ "$COMMAND" = "job" ] ; then
  CLASS=org.apache.hadoop.mapred.JobClient
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "queue" ] ; then
  CLASS=org.apache.hadoop.mapred.JobQueueClient
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "pipes" ] ; then
  CLASS=org.apache.hadoop.mapred.pipes.Submitter
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "version" ] ; then
  CLASS=org.apache.hadoop.util.VersionInfo
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "jar" ] ; then
  CLASS=org.apache.hadoop.util.RunJar
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "distcp" ] ; then
  CLASS=org.apache.hadoop.tools.DistCp
  CLASSPATH=${CLASSPATH}:${TOOL_PATH}
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "daemonlog" ] ; then
  CLASS=org.apache.hadoop.log.LogLevel
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "archive" ] ; then
  CLASS=org.apache.hadoop.tools.HadoopArchives
  CLASSPATH=${CLASSPATH}:${TOOL_PATH}
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "sampler" ] ; then
  CLASS=org.apache.hadoop.mapred.lib.InputSampler
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
else
  CLASS=$COMMAND
fi

# cygwin path translation
if $cygwin; then
  JAVA_HOME=`cygpath -w "$JAVA_HOME"`
  CLASSPATH=`cygpath -wp "$CLASSPATH"`
  HADOOP_HOME=`cygpath -w "$HADOOP_HOME"`
  HADOOP_LOG_DIR=`cygpath -w "$HADOOP_LOG_DIR"`
  TOOL_PATH=`cygpath -wp "$TOOL_PATH"`
fi

# setup 'java.library.path' for native-hadoop code if necessary
JAVA_LIBRARY_PATH=''
if [ -d "${HADOOP_HOME}/build/native" -o -d "${HADOOP_HOME}/lib/native" -o -e "${HADOOP_PREFIX}/lib/libhadoop.a" ]; then
  JAVA_PLATFORM=`${JAVA} -classpath ${CLASSPATH} -Xmx32m ${HADOOP_JAVA_PLATFORM_OPTS} org.apache.hadoop.util.PlatformName | sed -e "s/ /_/g"`
  #echo "JAVA_PLATFORM: $JAVA_PLATFORM"

  if [ "$JAVA_PLATFORM" = "Windows_7-amd64-64" ]; then
    JSVC_ARCH="amd64"
  elif [ "$JAVA_PLATFORM" = "Linux-amd64-64" ]; then
    JSVC_ARCH="amd64"
  else
    JSVC_ARCH="i386"
  fi

  if [ -d "$HADOOP_HOME/build/native" ]; then
    JAVA_LIBRARY_PATH=${HADOOP_HOME}/build/native/${JAVA_PLATFORM}/lib
  fi

  if [ -d "${HADOOP_HOME}/lib/native" ]; then
    if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
      JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:${HADOOP_HOME}/lib/native/${JAVA_PLATFORM}
    else
      JAVA_LIBRARY_PATH=${HADOOP_HOME}/lib/native/${JAVA_PLATFORM}
    fi
  fi

  if [ -e "${HADOOP_PREFIX}/lib/libhadoop.a" ]; then
    JAVA_LIBRARY_PATH=${HADOOP_PREFIX}/lib
  fi
fi

# cygwin path translation
if $cygwin; then
  JAVA_LIBRARY_PATH=`cygpath -wp "$JAVA_LIBRARY_PATH"`
  PATH="/cygwin/bin:/cygwin/usr/bin:`cygpath -p ${PATH}`"
fi

HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.tmp.dir=$HADOOP_TMP_DIR"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.file=$HADOOP_LOGFILE"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.home.dir=$HADOOP_HOME"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"

#turn security logger on the namenode and jobtracker only
if [ $COMMAND = "namenode" ] || [ $COMMAND = "jobtracker" ]; then
  HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,DRFAS}"
else
  HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,NullAppender}"
fi

if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
  HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$JAVA_LIBRARY_PATH"
fi
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.policy.file=$HADOOP_POLICYFILE"

# Check to see if we should start a secure datanode
if [ "$starting_secure_dn" = "true" ]; then
  if [ "$HADOOP_PID_DIR" = "" ]; then
    HADOOP_SECURE_DN_PID="/tmp/hadoop_secure_dn.pid"
  else
    HADOOP_SECURE_DN_PID="$HADOOP_PID_DIR/hadoop_secure_dn.pid"
  fi

  exec "$HADOOP_HOME/libexec/jsvc.${JSVC_ARCH}" -Dproc_$COMMAND -outfile "$HADOOP_LOG_DIR/jsvc.out" \
    -errfile "$HADOOP_LOG_DIR/jsvc.err" \
    -pidfile "$HADOOP_SECURE_DN_PID" \
    -nodetach \
    -user "$HADOOP_SECURE_DN_USER" \
    -cp "$CLASSPATH" \
    $JAVA_HEAP_MAX $HADOOP_OPTS \
    org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
else
  # run it
  exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"
fi

----

Next the hadoop fs and utilities are broken, as they expect shells with POSIX /bin executables in their path (bash, chmod, chown, chgrp).

For various reasons it's a really bad idea to add "/cygwin/bin" to your windows path, so we're going to have to fix the utility classes to
be cygwin aware and use the "/cygwin/bin" binaries instead. This is why you need the source: we're going to have to fix the java source and recompile the hadoop core libraries (and it is why you need ant and ivy).

----

Before we do this, the contrib Gridmix is broken, as it uses a strange generic Enum code that just craps out in jdk/jre 1.7 and above. The fix is to dumb it down and use untyped Enums.

Gridmix.java:

/*
private <T> String getEnumValues(Enum<? extends T>[] e) {
  StringBuilder sb = new StringBuilder();
  String sep = "";
  for (Enum<? extends T> v : e) {
    sb.append(sep);
    sb.append(v.name());
    sep = "|";
  }
  return sb.toString();
}
*/
private String getEnumValues(Enum[] e) {
  StringBuilder sb = new StringBuilder();
  String sep = "";
  for (Enum v : e) {
    sb.append(sep);
    sb.append(v.name());
    sep = "|";
  }
  return sb.toString();
}

---

Next, the ivy build.xml and build-contrib scripts are broken, as they fail to set the correct compiler javac.target=1.7 everywhere. Modify all of these to include the following in all javac targets:

build-contrib.xml:

<property name="javac.debug" value="on"/>
<property name="javac.version" value="1.7"/>

...

<!-- ====================================================== -->
<!-- Compile a Hadoop contrib's files                       -->
<!-- ====================================================== -->
<target name="compile" depends="init, ivy-retrieve-common" unless="skip.contrib">
  <echo message="contrib: ${name}"/>
  <javac
   encoding="${build.encoding}"
   srcdir="${src.dir}"
   includes="**/*.java"
   destdir="${build.classes}"
   target="${javac.version}"
   source="${javac.version}"
   optimize="${javac.optimize}"
   debug="${javac.debug}"
   deprecation="${javac.deprecation}">
    <classpath refid="contrib-classpath"/>
  </javac>
</target>

---

Next we fix the hadoop utilities Shell.java to use cygwin paths:

Shell.java:

/** Set to true on Windows platforms */
public static final boolean WINDOWS /* borrowed from Path.WINDOWS */
  = System.getProperty("os.name").startsWith("Windows");

/** a Unix command to get the current user's name */
public final static String USER_NAME_COMMAND = (WINDOWS ? "/cygwin/bin/whoami" : "whoami");

/** a Unix command to get the current user's groups list */
public static String[] getGroupsCommand() {
  return new String[]{ (WINDOWS ? "/cygwin/bin/bash" : "bash"), "-c", "groups"};
}

/** a Unix command to get a given user's groups list */
public static String[] getGroupsForUserCommand(final String user) {
  //'groups username' command return is non-consistent across different unixes
  return new String [] {(WINDOWS ? "/cygwin/bin/bash" : "bash"), "-c", "id -Gn " + user};
}

/** a Unix command to get a given netgroup's user list */
public static String[] getUsersForNetgroupCommand(final String netgroup) {
  //'groups username' command return is non-consistent across different unixes
  return new String [] {(WINDOWS ? "/cygwin/bin/bash" : "bash"), "-c", "getent netgroup " + netgroup};
}

/** Return a Unix command to get permission information. */
public static String[] getGET_PERMISSION_COMMAND() {
  //force /bin/ls, except on windows.
  return new String[] {(WINDOWS ? "/cygwin/bin/ls" : "/bin/ls"), "-ld"};
}

/** a Unix command to set permission */
public static final String SET_PERMISSION_COMMAND = (WINDOWS ? "/cygwin/bin/chmod" : "chmod");

/** a Unix command to set owner */
public static final String SET_OWNER_COMMAND = (WINDOWS ? "/cygwin/bin/chown" : "chown");

/** a Unix command to set group */
public static final String SET_GROUP_COMMAND = (WINDOWS ? "/cygwin/bin/chgrp" : "chgrp");

/** a Unix command to get ulimit of a process. */
public static final String ULIMIT_COMMAND = "ulimit";

----

Lastly, and despite this fix, hadoop filesystem's FileUtil complains about RawLocalFileSystem, breaking during directory creation and verification because the shell's return value is improperly parsed. You can fix this in a number of ways. I took the lazy approach and just made all mkdir functions catch all IOExceptions silently.

RawLocalFileSystem.java:

/**
 * Creates the specified directory hierarchy. Does not
 * treat existence as an error.
 */
public boolean mkdirs(Path f) throws IOException {
  boolean b = false;
  try {
    Path parent = f.getParent();
    File p2f = pathToFile(f);
    b = (parent == null || mkdirs(parent)) &&
      (p2f.mkdir() || p2f.isDirectory());
  } catch (IOException e) {}
  return b;
}

/** {@inheritDoc} */
@Override
public boolean mkdirs(Path f, FsPermission permission) throws IOException {
  boolean b = false;
  try {
    b = mkdirs(f);
    setPermission(f, permission);
  } catch (IOException e) {}
  return b;
}

---

Finally, rebuild hadoop with "ant -f build.xml compile". Copy the jars in the build directory over the existing jars in the hadoop home parent directory. Reformat the namenode, and run start-all.sh.

You should see 4 java processes for the namenode, datanode, jobtracker, and tasktracker. That was a lot of yak shaving just to get this running.

> taskTracker could not start because "Failed to set permissions" to "ttprivate to 0700"
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7682
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7682
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.20.203.0, 0.20.205.0, 1.0.0
>         Environment: OS:WindowsXP SP3 , Filesystem :NTFS, cygwin 1.7.9-1, jdk1.6.0_05
>            Reporter: Magic Xie
>
> ERROR org.apache.hadoop.mapred.TaskTracker:Can not start task tracker because java.io.IOException:Failed to set permissions of path:/tmp/hadoop-cyg_server/mapred/local/ttprivate to 0700
>     at org.apache.hadoop.fs.RawLocalFileSystem.checkReturnValue(RawLocalFileSystem.java:525)
>     at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:499)
>     at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:318)
>     at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:183)
>     at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:635)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1328)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3430)
>
> Since hadoop0.20.203 when the TaskTracker initialize, it checks the permission(TaskTracker Line 624) of
> (org.apache.hadoop.mapred.TaskTracker.TT_LOG_TMP_DIR,org.apache.hadoop.mapred.TaskTracker.TT_PRIVATE_DIR,
> org.apache.hadoop.mapred.TaskTracker.TT_PRIVATE_DIR).RawLocalFileSystem(http://svn.apache.org/viewvc/hadoop/common/tags/release-0.20.203.0/src/core/org/apache/hadoop/fs/RawLocalFileSystem.java?view=markup)
> call setPermission(Line 481) to deal with it, setPermission works fine on *nx, however,it dose not alway works on windows.
>
> setPermission call setReadable of Java.io.File in the line 498, but according to the Table1 below provided by oracle,setReadable(false) will always return false on windows, the same as setExecutable(false).
>
> http://java.sun.com/developer/technicalArticles/J2SE/Desktop/javase6/enhancements/
>
> is it cause the task tracker "Failed to set permissions" to "ttprivate to 0700"?
>
> Hadoop 0.20.202 works fine in the same environment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira