Hello,

I am using hadoop-0.20.2-cdh3u1.

First question: can one omit the sorting step in streaming (e.g. when the reducer only sums numbers)?

Second question: why do I have to run my jobs from an empty current working directory? When I run a job from my home directory, I get this:

13/12/19 16:22:40 ERROR streaming.StreamJob: Error launching job , bad input path : File file:/home/xhancar/.mozilla/.ozilla/firefox/vab3tgqp.default/lock does not exist.

The path looks like total nonsense.

Thanks,
Pavel Hančar

P.S. The whole thing:

xhancar@alba:~$ /packages/run.64/hadoop-0.20.2-cdh3u1/bin/hadoop jar /packages/run.64/hadoop-0.20.2-cdh3u1/contrib/streaming/hadoop-streaming-0.20.2-cdh3u1.jar -D stream.non.zero.exit.is.failure=false -D stream.map.input.ignoreKey=true -D mapred.reduce.tasks=1 -libjars '' -input /user/xhancar/cxvii -output output -mapper /home/xhancar/dp/bin/wcl.sh -file /home/xhancar/dp/bin/wcl.sh -reducer /home/xhancar/dp/bin/sum.sh -file /home/xhancar/dp/bin/sum.sh -inputformat org.apache.hadoop.mapred.TextInputFormat
packageJobJar: [/home/xhancar/dp/bin/wcl.sh, /home/xhancar/dp/bin/sum.sh, /tmp/hadoop-xhancar/hadoop-unjar1228378690975633202/] [] /tmp/streamjob8038106018846805847.jar tmpDir=null
13/12/19 16:22:40 INFO mapred.JobClient: Cleaning up the staging area hdfs://alba:9000/tmp/hadoop-hadoopnlp/mapred/staging/xhancar/.staging/job_201312191531_0012
13/12/19 16:22:40 ERROR streaming.StreamJob: Error launching job , bad input path : File file:/home/xhancar/.mozilla/.ozilla/firefox/vab3tgqp.default/lock does not exist.
Streaming Command Failed!

But:

xhancar@alba:~$ cd empty/
xhancar@alba:~/empty$ /packages/run.64/hadoop-0.20.2-cdh3u1/bin/hadoop jar /packages/run.64/hadoop-0.20.2-cdh3u1/contrib/streaming/hadoop-streaming-0.20.2-cdh3u1.jar -D stream.non.zero.exit.is.failure=false -D stream.map.input.ignoreKey=true -D mapred.reduce.tasks=1 -libjars '' -input /user/xhancar/cxvii -output output -mapper /home/xhancar/dp/bin/wcl.sh -file /home/xhancar/dp/bin/wcl.sh -reducer /home/xhancar/dp/bin/sum.sh -file /home/xhancar/dp/bin/sum.sh -inputformat org.apache.hadoop.mapred.TextInputFormat
packageJobJar: [/home/xhancar/dp/bin/wcl.sh, /home/xhancar/dp/bin/sum.sh, /tmp/hadoop-xhancar/hadoop-unjar928216275517356553/] [] /tmp/streamjob361197118805255140.jar tmpDir=null
13/12/19 16:22:53 WARN snappy.LoadSnappy: Snappy native library is available
13/12/19 16:22:53 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/12/19 16:22:53 INFO snappy.LoadSnappy: Snappy native library loaded
13/12/19 16:22:53 INFO mapred.FileInputFormat: Total input paths to process : 1
13/12/19 16:22:54 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-xhancar/mapred/local]
13/12/19 16:22:54 INFO streaming.StreamJob: Running job: job_201312191531_0013
13/12/19 16:22:54 INFO streaming.StreamJob: To kill this job, run:
13/12/19 16:22:54 INFO streaming.StreamJob: /packages/run.64/hadoop-0.20.2-cdh3u1/bin/../bin/hadoop job -Dmapred.job.tracker=alba:9001 -kill job_201312191531_0013
13/12/19 16:22:54 INFO streaming.StreamJob: Tracking URL: http://alba.fi.muni.cz:50030/jobdetails.jsp?jobid=job_201312191531_0013
13/12/19 16:22:55 INFO streaming.StreamJob: map 0% reduce 0%
13/12/19 16:23:01 INFO streaming.StreamJob: map 100% reduce 0%
13/12/19 16:23:09 INFO streaming.StreamJob: map 100% reduce 33%
13/12/19 16:23:11 INFO streaming.StreamJob: map 100% reduce 100%
13/12/19 16:23:14 INFO streaming.StreamJob: Job complete: job_201312191531_0013
13/12/19 16:23:14 INFO streaming.StreamJob: Output: output
xhancar@alba:~/empty$
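
To make the first question concrete: a reducer that only sums numbers gives the same result whatever order its input arrives in, which is why the sort looks unnecessary here. Below is a minimal sketch of such a reducer, assuming one number at the start of each input line. The contents of sum.sh are not shown in this message, so the function name sum_numbers and its body are a hypothetical stand-in, not the actual script.

```shell
#!/bin/sh
# Hypothetical stand-in for a summing reducer (NOT the actual sum.sh).
# Assumption: each input line starts with one number.
sum_numbers() {
    # Add up the first field of every line read from stdin.
    # "s + 0" makes awk print 0 instead of an empty line on empty input.
    awk '{ s += $1 } END { print s + 0 }'
}
```

Used as a reducer, the script would end with a bare call to sum_numbers so that it reads the records streaming passes on stdin. Because addition is commutative, the output is the same for sorted and unsorted input.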