HADOOP-13107. clean up how rumen is executed
Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/9978d722
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/9978d722
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/9978d722

Branch: refs/heads/HADOOP-12930
Commit: 9978d722805c97103466ea9ba7b8e4c197318c3f
Parents: a894b44
Author: Allen Wittenauer <a...@apache.org>
Authored: Fri May 6 08:47:31 2016 -0700
Committer: Allen Wittenauer <a...@apache.org>
Committed: Thu May 12 16:01:27 2016 -0700

----------------------------------------------------------------------
 .../main/resources/assemblies/hadoop-tools.xml  |  8 +++
 .../src/main/shellprofile.d/hadoop-rumen.sh     | 58 ++++++++++++++++++++
 .../hadoop-rumen/src/site/markdown/Rumen.md.vm  | 40 +++-----------
 3 files changed, 75 insertions(+), 31 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/hadoop/blob/9978d722/hadoop-assemblies/src/main/resources/assemblies/hadoop-tools.xml
----------------------------------------------------------------------
diff --git a/hadoop-assemblies/src/main/resources/assemblies/hadoop-tools.xml b/hadoop-assemblies/src/main/resources/assemblies/hadoop-tools.xml
index c5ea6ad..8606e23 100644
--- a/hadoop-assemblies/src/main/resources/assemblies/hadoop-tools.xml
+++ b/hadoop-assemblies/src/main/resources/assemblies/hadoop-tools.xml
@@ -133,6 +133,14 @@
       </includes>
     </fileSet>
     <fileSet>
+      <directory>../hadoop-rumen/src/main/shellprofile.d</directory>
+      <includes>
+        <include>*</include>
+      </includes>
+      <outputDirectory>/libexec/shellprofile.d</outputDirectory>
+      <fileMode>0755</fileMode>
+    </fileSet>
+    <fileSet>
       <directory>../hadoop-streaming/target</directory>
       <outputDirectory>/share/hadoop/${hadoop.component}/sources</outputDirectory>
       <includes>

http://git-wip-us.apache.org/repos/asf/hadoop/blob/9978d722/hadoop-tools/hadoop-rumen/src/main/shellprofile.d/hadoop-rumen.sh
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-rumen/src/main/shellprofile.d/hadoop-rumen.sh b/hadoop-tools/hadoop-rumen/src/main/shellprofile.d/hadoop-rumen.sh
new file mode 100755
index 0000000..d7d4022
--- /dev/null
+++ b/hadoop-tools/hadoop-rumen/src/main/shellprofile.d/hadoop-rumen.sh
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+if ! declare -f hadoop_subcommand_rumenfolder >/dev/null 2>/dev/null; then
+
+  if [[ "${HADOOP_SHELL_EXECNAME}" = hadoop ]]; then
+    hadoop_add_subcommand "rumenfolder" "scale a rumen input trace"
+  fi
+
+## @description rumenfolder command for hadoop
+## @audience public
+## @stability stable
+## @replaceable yes
+function hadoop_subcommand_rumenfolder
+{
+  # shellcheck disable=SC2034
+  HADOOP_CLASSNAME=org.apache.hadoop.tools.rumen.Folder
+  hadoop_add_to_classpath_tools hadoop-rumen
+  hadoop_debug "Appending HADOOP_CLIENT_OPTS onto HADOOP_OPTS"
+  HADOOP_OPTS="${HADOOP_OPTS} ${HADOOP_CLIENT_OPTS}"
+}
+
+fi
+
+if ! declare -f hadoop_subcommand_rumentrace >/dev/null 2>/dev/null; then
+
+  if [[ "${HADOOP_SHELL_EXECNAME}" = hadoop ]]; then
+    hadoop_add_subcommand "rumentrace" "convert logs into a rumen trace"
+  fi
+
+## @description rumentrace command for hadoop
+## @audience public
+## @stability stable
+## @replaceable yes
+function hadoop_subcommand_rumentrace
+{
+  # shellcheck disable=SC2034
+  HADOOP_CLASSNAME=org.apache.hadoop.tools.rumen.TraceBuilder
+  hadoop_add_to_classpath_tools hadoop-rumen
+  hadoop_debug "Appending HADOOP_CLIENT_OPTS onto HADOOP_OPTS"
+  HADOOP_OPTS="${HADOOP_OPTS} ${HADOOP_CLIENT_OPTS}"
+}
+
+fi

http://git-wip-us.apache.org/repos/asf/hadoop/blob/9978d722/hadoop-tools/hadoop-rumen/src/site/markdown/Rumen.md.vm
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-rumen/src/site/markdown/Rumen.md.vm b/hadoop-tools/hadoop-rumen/src/site/markdown/Rumen.md.vm
index bee976a..34dfd0b 100644
--- a/hadoop-tools/hadoop-rumen/src/site/markdown/Rumen.md.vm
+++ b/hadoop-tools/hadoop-rumen/src/site/markdown/Rumen.md.vm
@@ -50,8 +50,8 @@ but a simulation of the scheduler elects to run that task on a remote
 rack, the simulator requires a runtime its input cannot provide. To
 fill in these gaps, Rumen performs a statistical analysis of the
 digest to estimate the variables the trace doesn't supply. Rumen traces
-drive both Gridmix (a benchmark of Hadoop MapReduce clusters) and Mumak
-(a simulator for the JobTracker).
+drive both Gridmix (a benchmark of Hadoop MapReduce clusters) and SLS
+(a simulator for the resource manager scheduler).
 
 $H3 Motivation
 
@@ -126,16 +126,13 @@ can use the `Folder` utility to fold the current trace to the desired length.
 
 The remaining part of this section explains these utilities in detail.
 
-Examples in this section assumes that certain libraries are present
-in the java CLASSPATH. See [Dependencies](#Dependencies) for more details.
-
 $H3 Trace Builder
 
 $H4 Command
 
 ```
-java org.apache.hadoop.tools.rumen.TraceBuilder [options] <jobtrace-output> <topology-output> <inputs>
+hadoop rumentrace [options] <jobtrace-output> <topology-output> <inputs>
 ```
 
 This command invokes the `TraceBuilder` utility of *Rumen*.
@@ -205,12 +202,8 @@ $H4 Options
 
 $H4 Example
 
-*Rumen* expects certain library *JARs* to be present in the *CLASSPATH*.
-One simple way to run Rumen is to use
-`$HADOOP_HOME/bin/hadoop jar` command to run it as example below.
-
 ```
-java org.apache.hadoop.tools.rumen.TraceBuilder \
+hadoop rumentrace \
   file:///tmp/job-trace.json \
   file:///tmp/job-topology.json \
   hdfs:///tmp/hadoop-yarn/staging/history/done_intermediate/testuser
@@ -229,7 +222,7 @@ $H3 Folder
 $H4 Command
 
 ```
-java org.apache.hadoop.tools.rumen.Folder [options] [input] [output]
+hadoop rumenfolder [options] [input] [output]
 ```
 
 This command invokes the `Folder` utility of
@@ -350,7 +343,7 @@ $H4 Examples
 $H5 Folding an input trace with 10 hours of total runtime to generate an output trace with 1 hour of total runtime
 
 ```
-java org.apache.hadoop.tools.rumen.Folder \
+hadoop rumenfolder \
   -output-duration 1h \
   -input-cycle 20m \
   file:///tmp/job-trace.json \
@@ -362,7 +355,7 @@ If the folded jobs are out of order then the command will bail out.
 $H5 Folding an input trace with 10 hours of total runtime to generate an output trace with 1 hour of total runtime and tolerate some skewness
 
 ```
-java org.apache.hadoop.tools.rumen.Folder \
+hadoop rumenfolder \
   -output-duration 1h \
   -input-cycle 20m \
   -allow-missorting \
@@ -378,7 +371,7 @@ If the folded jobs are out of order, then atmost
 $H5 Folding an input trace with 10 hours of total runtime to generate an output trace with 1 hour of total runtime in debug mode
 
 ```
-java org.apache.hadoop.tools.rumen.Folder \
+hadoop rumenfolder \
   -output-duration 1h \
   -input-cycle 20m \
   -debug -temp-directory file:///tmp/debug \
@@ -395,7 +388,7 @@ up.
 $H5 Folding an input trace with 10 hours of total runtime to generate an output trace with 1 hour of total runtime with custom concentration.
 
 ```
-java org.apache.hadoop.tools.rumen.Folder \
+hadoop rumenfolder \
   -output-duration 1h \
   -input-cycle 20m \
   -concentration 2 \
@@ -421,18 +414,3 @@ Look at the MapReduce
 <a href="https://issues.apache.org/jira/browse/MAPREDUCE/component/12313617">rumen-component</a>
 for further details.
 
-
-$H3 Dependencies
-
-*Rumen* expects certain library *JARs* to be present in the *CLASSPATH*.
-One simple way to run Rumen is to use
-`hadoop jar` command to run it as example below.
-
-```
-$HADOOP_HOME/bin/hadoop jar \
-  $HADOOP_HOME/share/hadoop/tools/lib/hadoop-rumen-2.5.1.jar \
-  org.apache.hadoop.tools.rumen.TraceBuilder \
-  file:///tmp/job-trace.json \
-  file:///tmp/job-topology.json \
-  hdfs:///tmp/hadoop-yarn/staging/history/done_intermediate/testuser
-```
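
Note on the guard used in hadoop-rumen.sh: each subcommand function is wrapped in `if ! declare -f ... ; then`, so a function of the same name that already exists when the profile is loaded appears to take precedence over the default. The sketch below illustrates only that pattern outside of Hadoop. The stub `hadoop_add_subcommand`, the `registered` variable, and the echoed strings are assumptions made for illustration, not Hadoop's real helpers, and the `HADOOP_SHELL_EXECNAME` check from the patch is left out for brevity.

```shell
#!/usr/bin/env bash
# Illustration only: this stub stands in for Hadoop's hadoop_add_subcommand
# and simply records which subcommand names were registered.
registered=""
hadoop_add_subcommand() { registered="${registered} $1"; }

# Hypothetical site override, defined before the profile logic runs.
hadoop_subcommand_rumenfolder() { echo "site override"; }

# Same guard shape as the patch: define the default only if no function
# with this name exists yet.
if ! declare -f hadoop_subcommand_rumenfolder >/dev/null 2>&1; then
  hadoop_add_subcommand "rumenfolder" "scale a rumen input trace"
  hadoop_subcommand_rumenfolder() { echo "default rumenfolder"; }
fi

if ! declare -f hadoop_subcommand_rumentrace >/dev/null 2>&1; then
  hadoop_add_subcommand "rumentrace" "convert logs into a rumen trace"
  hadoop_subcommand_rumentrace() { echo "default rumentrace"; }
fi
```

In this sketch the pre-existing `hadoop_subcommand_rumenfolder` is left untouched, while `hadoop_subcommand_rumentrace` receives the default definition and is the only name recorded by the stub.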