Modified: incubator/samza/site/learn/tutorials/0.7.0/run-in-multi-node-yarn.html
URL: 
http://svn.apache.org/viewvc/incubator/samza/site/learn/tutorials/0.7.0/run-in-multi-node-yarn.html?rev=1612998&r1=1612997&r2=1612998&view=diff
==============================================================================
--- incubator/samza/site/learn/tutorials/0.7.0/run-in-multi-node-yarn.html 
(original)
+++ incubator/samza/site/learn/tutorials/0.7.0/run-in-multi-node-yarn.html Thu 
Jul 24 05:05:00 2014
@@ -133,24 +133,24 @@
 
 <p>1. Download <a 
href="http://mirror.symnds.com/software/Apache/hadoop/common/hadoop-2.3.0/hadoop-2.3.0.tar.gz">YARN
 2.3</a> to /tmp and untar it.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash"><span 
class="nb">cd</span> /tmp
+<div class="highlight"><pre><code class="bash"><span class="nb">cd</span> /tmp
 tar -xvf hadoop-2.3.0.tar.gz
 <span class="nb">cd </span>hadoop-2.3.0</code></pre></div>
 
 <p>2. Set up environment variables.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash"><span 
class="nb">export </span><span class="nv">HADOOP_YARN_HOME</span><span 
class="o">=</span><span class="k">$(</span><span class="nb">pwd</span><span 
class="k">)</span>
+<div class="highlight"><pre><code class="bash"><span class="nb">export 
</span><span class="nv">HADOOP_YARN_HOME</span><span class="o">=</span><span 
class="k">$(</span><span class="nb">pwd</span><span class="k">)</span>
 mkdir conf
 <span class="nb">export </span><span class="nv">HADOOP_CONF_DIR</span><span 
class="o">=</span><span 
class="nv">$HADOOP_YARN_HOME</span>/conf</code></pre></div>
 
 <p>3. Configure YARN setting file.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">cp 
./etc/hadoop/yarn-site.xml conf
+<div class="highlight"><pre><code class="bash">cp ./etc/hadoop/yarn-site.xml 
conf
 vi conf/yarn-site.xml</code></pre></div>
 
 <p>Add the following property to yarn-site.xml:</p>
 
-<div class="highlight"><pre><code class="language-xml" data-lang="xml"><span 
class="nt">&lt;property&gt;</span>
+<div class="highlight"><pre><code class="xml"><span 
class="nt">&lt;property&gt;</span>
     <span class="nt">&lt;name&gt;</span>yarn.resourcemanager.hostname<span 
class="nt">&lt;/name&gt;</span>
     <span class="c">&lt;!-- hostname that is accessible from all NMs 
--&gt;</span>
     <span class="nt">&lt;value&gt;</span>yourHostname<span 
class="nt">&lt;/value&gt;</span>
@@ -165,23 +165,23 @@ vi conf/yarn-site.xml</code></pre></div>
 
 <p>4. Download Scala package and untar it.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash"><span 
class="nb">cd</span> /tmp
+<div class="highlight"><pre><code class="bash"><span class="nb">cd</span> /tmp
 curl http://www.scala-lang.org/files/archive/scala-2.10.3.tgz &gt; 
scala-2.10.3.tgz
 tar -xvf scala-2.10.3.tgz</code></pre></div>
 
 <p>5. Add Scala and its log jars.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">cp 
/tmp/scala-2.10.3/lib/scala-compiler.jar <span 
class="nv">$HADOOP_YARN_HOME</span>/share/hadoop/hdfs/lib
+<div class="highlight"><pre><code class="bash">cp 
/tmp/scala-2.10.3/lib/scala-compiler.jar <span 
class="nv">$HADOOP_YARN_HOME</span>/share/hadoop/hdfs/lib
 cp /tmp/scala-2.10.3/lib/scala-library.jar <span 
class="nv">$HADOOP_YARN_HOME</span>/share/hadoop/hdfs/lib
 curl http://search.maven.org/remotecontent?filepath<span 
class="o">=</span>org/clapper/grizzled-slf4j_2.10/1.0.1/grizzled-slf4j_2.10-1.0.1.jar
 &gt; <span 
class="nv">$HADOOP_YARN_HOME</span>/share/hadoop/hdfs/lib/grizzled-slf4j_2.10-1.0.1.jar</code></pre></div>
 
 <p>6. Add http configuration in core-site.xml (create the core-site.xml file 
and add content).</p>
 
-<div class="highlight"><pre><code class="language-xml" data-lang="xml">vi 
$HADOOP_YARN_HOME/conf/core-site.xml</code></pre></div>
+<div class="highlight"><pre><code class="xml">vi 
$HADOOP_YARN_HOME/conf/core-site.xml</code></pre></div>
 
 <p>Add the following code:</p>
 
-<div class="highlight"><pre><code class="language-xml" data-lang="xml"><span 
class="cp">&lt;?xml-stylesheet type=&quot;text/xsl&quot; 
href=&quot;configuration.xsl&quot;?&gt;</span>
+<div class="highlight"><pre><code class="xml"><span 
class="cp">&lt;?xml-stylesheet type=&quot;text/xsl&quot; 
href=&quot;configuration.xsl&quot;?&gt;</span>
 <span class="nt">&lt;configuration&gt;</span>
     <span class="nt">&lt;property&gt;</span>
       <span class="nt">&lt;name&gt;</span>fs.http.impl<span 
class="nt">&lt;/name&gt;</span>
@@ -193,7 +193,7 @@ curl http://search.maven.org/remoteconte
 
 <p>7. Copy the Hadoop directory from your host machine to the slave 
machines (172.21.100.35, in my case):</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">scp 
-r . 172.21.100.35:/tmp/hadoop-2.3.0
+<div class="highlight"><pre><code class="bash">scp -r . 
172.21.100.35:/tmp/hadoop-2.3.0
 <span class="nb">echo </span>172.21.100.35 &gt; conf/slaves
 sbin/start-yarn.sh</code></pre></div>
 
@@ -209,7 +209,7 @@ sbin/start-yarn.sh</code></pre></div>
 
 <p>1. Download Samza and publish it to Maven local repository.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash"><span 
class="nb">cd</span> /tmp
+<div class="highlight"><pre><code class="bash"><span class="nb">cd</span> /tmp
 git clone http://git-wip-us.apache.org/repos/asf/incubator-samza.git
 <span class="nb">cd </span>incubator-samza
 ./gradlew clean publishToMavenLocal
@@ -217,17 +217,17 @@ git clone http://git-wip-us.apache.org/r
 
 <p>2. Download hello-samza project and change the job properties file.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">git 
clone git://github.com/linkedin/hello-samza.git
+<div class="highlight"><pre><code class="bash">git clone 
git://github.com/linkedin/hello-samza.git
 <span class="nb">cd </span>hello-samza
 vi 
samza-job-package/src/main/config/wikipedia-feed.properties</code></pre></div>
 
 <p>Change the yarn.package.path property to be:</p>
 
-<div class="highlight"><pre><code class="language-jproperties" 
data-lang="jproperties"><span class="na">yarn.package.path</span><span 
class="o">=</span><span 
class="s">http://yourHostname:8000/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz</span></code></pre></div>
+<div class="highlight"><pre><code class="jproperties"><span 
class="na">yarn.package.path</span><span class="o">=</span><span 
class="s">http://yourHostname:8000/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz</span></code></pre></div>
 
 <p>3. Compile hello-samza.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">mvn 
clean package
+<div class="highlight"><pre><code class="bash">mvn clean package
 mkdir -p deploy/samza
 tar -xvf ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz -C 
deploy/samza</code></pre></div>
 
@@ -235,11 +235,11 @@ tar -xvf ./samza-job-package/target/samz
 
 <p>Open a new terminal, and run:</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash"><span 
class="nb">cd</span> /tmp/hello-samza <span class="o">&amp;&amp;</span> python 
-m SimpleHTTPServer</code></pre></div>
+<div class="highlight"><pre><code class="bash"><span class="nb">cd</span> 
/tmp/hello-samza <span class="o">&amp;&amp;</span> python -m 
SimpleHTTPServer</code></pre></div>
 
 <p>Go back to the original terminal (not the one running the HTTP server):</p>
 
-<div class="highlight"><pre><code class="language-bash" 
data-lang="bash">deploy/samza/bin/run-job.sh --config-factory<span 
class="o">=</span>org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path<span class="o">=</span>file://<span 
class="nv">$PWD</span>/deploy/samza/config/wikipedia-feed.properties</code></pre></div>
+<div class="highlight"><pre><code class="bash">deploy/samza/bin/run-job.sh 
--config-factory<span 
class="o">=</span>org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path<span class="o">=</span>file://<span 
class="nv">$PWD</span>/deploy/samza/config/wikipedia-feed.properties</code></pre></div>
 
 <p>Go to http://yourHostname:8088 and find the wikipedia-feed job. Click on 
the ApplicationMaster link to see that it&rsquo;s running.</p>
 

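Note on the HTTP server step in the tutorial above: `python -m SimpleHTTPServer` exists only in Python 2. The following is a minimal sketch of a Python 3 (3.7+) equivalent, assuming the package is served from the hello-samza checkout on port 8000, the port used in `yarn.package.path`. The helper name `make_package_server` is hypothetical and not part of Samza or the tutorial.

```python
import functools
import http.server


def make_package_server(directory, port=8000, host=""):
    """Build an HTTP server that serves the files under `directory`.

    Hypothetical helper: it mirrors what `python -m SimpleHTTPServer`
    does in Python 2, using only the Python 3 standard library.
    """
    # Bind the handler to a fixed directory instead of the current one.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)
```

To serve the job package as in the tutorial, one could call `make_package_server("/tmp/hello-samza").serve_forever()` in the second terminal; the path is the checkout location assumed by the tutorial.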
Modified: incubator/samza/site/sitemap.xml
URL: 
http://svn.apache.org/viewvc/incubator/samza/site/sitemap.xml?rev=1612998&r1=1612997&r2=1612998&view=diff
==============================================================================
--- incubator/samza/site/sitemap.xml (original)
+++ incubator/samza/site/sitemap.xml Thu Jul 24 05:05:00 2014
@@ -20,7 +20,7 @@
 
   <url>
     <loc>http://samza.incubator.apache.org/</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     <changefreq>daily</changefreq>
     <priority>1.0</priority>
   </url>
@@ -30,308 +30,315 @@
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/yarn/application-master.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/introduction/architecture.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/introduction/background.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/container/checkpointing.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/contribute/code.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/contribute/coding-guide.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/community/committers.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/introduction/concepts.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/jobs/configuration.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/contribute/disclaimer.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/container/event-loop.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/index.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/tutorials/0.7.0/index.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/index.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/startup/download/index.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/startup/hello-samza/0.7.0/index.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/comparisons/introduction.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/community/irc.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/yarn/isolation.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/container/jmx.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/jobs/job-runner.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/operations/kafka.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/jobs/logging.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/community/mailing-lists.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/container/metrics.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/comparisons/mupd8.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/api/overview.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/jobs/packaging.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/contribute/projects.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/tutorials/0.7.0/remote-debugging-samza.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/jobs/reprocessing.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/contribute/rules.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-hello-samza-without-internet.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/container/samza-container.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/operations/security.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     <loc>http://samza.incubator.apache.org/contribute/seps.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/container/serialization.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
+    
+    
+  </url>
+  
+  <url>
+    
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/comparisons/spark-streaming.html</loc>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/container/state-management.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/comparisons/storm.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/container/streams.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/container/windowing.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>
   
   <url>
     
<loc>http://samza.incubator.apache.org/learn/documentation/0.7.0/jobs/yarn-jobs.html</loc>
-    <lastmod>2014-07-09</lastmod>
+    <lastmod>2014-07-23</lastmod>
     
     
   </url>

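Note on the sitemap hunks above: they bump every `<lastmod>` from 2014-07-09 to 2014-07-23 and add one entry for the spark-streaming comparison page. The following is a minimal sketch of how such a bulk change could be checked, assuming the sitemap root declares the standard sitemaps.org namespace (the root element is not shown in this diff). The function names and the two-entry sample are illustrative only.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (assumed for this file).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lastmod_by_loc(sitemap_xml):
    """Map each <loc> URL to its <lastmod> value (None when absent)."""
    root = ET.fromstring(sitemap_xml)
    result = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        result[loc] = lastmod.strip() if lastmod else None
    return result


def stale_entries(sitemap_xml, expected_date):
    """Return the URLs whose <lastmod> differs from expected_date, sorted."""
    entries = lastmod_by_loc(sitemap_xml)
    return sorted(loc for loc, date in entries.items() if date != expected_date)


# Illustrative sample, not the real sitemap: one bumped entry, one stale.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://samza.incubator.apache.org/</loc>
    <lastmod>2014-07-23</lastmod>
  </url>
  <url>
    <loc>http://samza.incubator.apache.org/index.html</loc>
    <lastmod>2014-07-09</lastmod>
  </url>
</urlset>"""

print(stale_entries(SAMPLE, "2014-07-23"))
```

An empty list from `stale_entries` would indicate that every entry carries the expected date.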
Modified: incubator/samza/site/startup/download/index.html
URL: 
http://svn.apache.org/viewvc/incubator/samza/site/startup/download/index.html?rev=1612998&r1=1612997&r2=1612998&view=diff
==============================================================================
--- incubator/samza/site/startup/download/index.html (original)
+++ incubator/samza/site/startup/download/index.html Thu Jul 24 05:05:00 2014
@@ -141,7 +141,7 @@
 
 <p>A Maven-based Samza project can pull in all required Samza dependencies 
with this XML block:</p>
 
-<div class="highlight"><pre><code class="language-xml" data-lang="xml"><span 
class="nt">&lt;dependency&gt;</span>
+<div class="highlight"><pre><code class="xml"><span 
class="nt">&lt;dependency&gt;</span>
   <span class="nt">&lt;groupId&gt;</span>org.apache.samza<span 
class="nt">&lt;/groupId&gt;</span>
   <span class="nt">&lt;artifactId&gt;</span>samza-api<span 
class="nt">&lt;/artifactId&gt;</span>
   <span class="nt">&lt;version&gt;</span>0.7.0<span 
class="nt">&lt;/version&gt;</span>
@@ -190,14 +190,14 @@
 
 <p>Samza is available in the Apache Maven repository.</p>
 
-<div class="highlight"><pre><code class="language-xml" data-lang="xml"><span 
class="nt">&lt;repository&gt;</span>
+<div class="highlight"><pre><code class="xml"><span 
class="nt">&lt;repository&gt;</span>
   <span class="nt">&lt;id&gt;</span>apache-releases<span 
class="nt">&lt;/id&gt;</span>
   <span 
class="nt">&lt;url&gt;</span>https://repository.apache.org/content/groups/public<span
 class="nt">&lt;/url&gt;</span>
 <span class="nt">&lt;/repository&gt;</span></code></pre></div>
 
 <p>Snapshot builds are available in the Apache Maven snapshot repository.</p>
 
-<div class="highlight"><pre><code class="language-xml" data-lang="xml"><span 
class="nt">&lt;repository&gt;</span>
+<div class="highlight"><pre><code class="xml"><span 
class="nt">&lt;repository&gt;</span>
   <span class="nt">&lt;id&gt;</span>apache-snapshots<span 
class="nt">&lt;/id&gt;</span>
   <span 
class="nt">&lt;url&gt;</span>https://repository.apache.org/content/groups/snapshots<span
 class="nt">&lt;/url&gt;</span>
 <span class="nt">&lt;/repository&gt;</span></code></pre></div>
@@ -206,7 +206,7 @@
 
 <p>If you&rsquo;re interested in working on Samza, or building the JARs from 
scratch, then you&rsquo;ll need to checkout and build the code. Samza does not 
have a binary release at this time. To check out and build Samza, run these 
commands.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">git 
clone http://git-wip-us.apache.org/repos/asf/incubator-samza.git
+<div class="highlight"><pre><code class="bash">git clone 
http://git-wip-us.apache.org/repos/asf/incubator-samza.git
 <span class="nb">cd </span>incubator-samza
 ./gradlew clean build</code></pre></div>
 

Modified: incubator/samza/site/startup/hello-samza/0.7.0/index.html
URL: 
http://svn.apache.org/viewvc/incubator/samza/site/startup/hello-samza/0.7.0/index.html?rev=1612998&r1=1612997&r2=1612998&view=diff
==============================================================================
--- incubator/samza/site/startup/hello-samza/0.7.0/index.html (original)
+++ incubator/samza/site/startup/hello-samza/0.7.0/index.html Thu Jul 24 
05:05:00 2014
@@ -129,7 +129,7 @@
 
 <p>Check out the hello-samza project:</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">git 
clone git://git.apache.org/incubator-samza-hello-samza.git hello-samza
+<div class="highlight"><pre><code class="bash">git clone 
git://git.apache.org/incubator-samza-hello-samza.git hello-samza
 <span class="nb">cd </span>hello-samza</code></pre></div>
 
 <p>This project contains everything you&rsquo;ll need to run your first Samza 
jobs.</p>
@@ -138,7 +138,7 @@
 
 <p>A Samza grid usually comprises three different systems: <a 
href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a>,
 <a href="http://kafka.apache.org/">Kafka</a>, and <a 
href="http://zookeeper.apache.org/">ZooKeeper</a>. The hello-samza project 
comes with a script called &ldquo;grid&rdquo; to help you set up these systems. 
Start by running:</p>
 
-<div class="highlight"><pre><code class="language-bash" 
data-lang="bash">bin/grid bootstrap</code></pre></div>
+<div class="highlight"><pre><code class="bash">bin/grid 
bootstrap</code></pre></div>
 
 <p>This command will download, install, and start ZooKeeper, Kafka, and YARN. 
It will also check out the latest version of Samza and build it. All package 
files will be put in a sub-directory called &ldquo;deploy&rdquo; inside 
hello-samza&rsquo;s root folder.</p>
 
@@ -150,7 +150,7 @@
 
 <p>Before you can run a Samza job, you need to build a package for it. This 
package is what YARN uses to deploy your jobs on the grid.</p>
 
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">mvn 
clean package
+<div class="highlight"><pre><code class="bash">mvn clean package
 mkdir -p deploy/samza
 tar -xvf ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz -C 
deploy/samza</code></pre></div>
 
@@ -158,11 +158,11 @@ tar -xvf ./samza-job-package/target/samz
 
 <p>After you&rsquo;ve built your Samza package, you can start a job on the 
grid using the run-job.sh script.</p>
 
-<div class="highlight"><pre><code class="language-bash" 
data-lang="bash">deploy/samza/bin/run-job.sh --config-factory<span 
class="o">=</span>org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path<span class="o">=</span>file://<span 
class="nv">$PWD</span>/deploy/samza/config/wikipedia-feed.properties</code></pre></div>
+<div class="highlight"><pre><code class="bash">deploy/samza/bin/run-job.sh 
--config-factory<span 
class="o">=</span>org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path<span class="o">=</span>file://<span 
class="nv">$PWD</span>/deploy/samza/config/wikipedia-feed.properties</code></pre></div>
 
 <p>The job will consume a feed of real-time edits from Wikipedia, and produce 
them to a Kafka topic called &ldquo;wikipedia-raw&rdquo;. Give the job a minute 
to startup, and then tail the Kafka topic:</p>
 
-<div class="highlight"><pre><code class="language-bash" 
data-lang="bash">deploy/kafka/bin/kafka-console-consumer.sh  --zookeeper 
localhost:2181 --topic wikipedia-raw</code></pre></div>
+<div class="highlight"><pre><code 
class="bash">deploy/kafka/bin/kafka-console-consumer.sh  --zookeeper 
localhost:2181 --topic wikipedia-raw</code></pre></div>
 
 <p>Pretty neat, right? Now, check out the YARN UI again (<a 
href="http://localhost:8088">http://localhost:8088</a>). This time around, 
you&rsquo;ll see your Samza job is running!</p>
 
@@ -172,20 +172,20 @@ tar -xvf ./samza-job-package/target/samz
 
 <p>Let&rsquo;s calculate some statistics based on the messages in the 
wikipedia-raw topic. Start two more jobs:</p>
 
-<div class="highlight"><pre><code class="language-bash" 
data-lang="bash">deploy/samza/bin/run-job.sh --config-factory<span 
class="o">=</span>org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path<span class="o">=</span>file://<span 
class="nv">$PWD</span>/deploy/samza/config/wikipedia-parser.properties
+<div class="highlight"><pre><code class="bash">deploy/samza/bin/run-job.sh 
--config-factory<span 
class="o">=</span>org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path<span class="o">=</span>file://<span 
class="nv">$PWD</span>/deploy/samza/config/wikipedia-parser.properties
 deploy/samza/bin/run-job.sh --config-factory<span 
class="o">=</span>org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path<span class="o">=</span>file://<span 
class="nv">$PWD</span>/deploy/samza/config/wikipedia-stats.properties</code></pre></div>
 
 <p>The first job (wikipedia-parser) parses the messages in wikipedia-raw, and 
extracts information about the size of the edit, who made the change, etc. You 
can take a look at its output with:</p>
 
-<div class="highlight"><pre><code class="language-bash" 
data-lang="bash">deploy/kafka/bin/kafka-console-consumer.sh  --zookeeper 
localhost:2181 --topic wikipedia-edits</code></pre></div>
+<div class="highlight"><pre><code 
class="bash">deploy/kafka/bin/kafka-console-consumer.sh  --zookeeper 
localhost:2181 --topic wikipedia-edits</code></pre></div>
 
 <p>The last job (wikipedia-stats) reads messages from the wikipedia-edits 
topic, and calculates counts, every ten seconds, for all edits that were made 
during that window. It outputs these counts to the wikipedia-stats topic.</p>
 
-<div class="highlight"><pre><code class="language-bash" 
data-lang="bash">deploy/kafka/bin/kafka-console-consumer.sh  --zookeeper 
localhost:2181 --topic wikipedia-stats</code></pre></div>
+<div class="highlight"><pre><code 
class="bash">deploy/kafka/bin/kafka-console-consumer.sh  --zookeeper 
localhost:2181 --topic wikipedia-stats</code></pre></div>
 
 <p>The messages in the stats topic look like this:</p>
 
-<div class="highlight"><pre><code class="language-json" data-lang="json"><span 
class="p">{</span><span class="nt">&quot;is-talk&quot;</span><span 
class="p">:</span><span class="mi">2</span><span class="p">,</span><span 
class="nt">&quot;bytes-added&quot;</span><span class="p">:</span><span 
class="mi">5276</span><span class="p">,</span><span 
class="nt">&quot;edits&quot;</span><span class="p">:</span><span 
class="mi">13</span><span class="p">,</span><span 
class="nt">&quot;unique-titles&quot;</span><span class="p">:</span><span 
class="mi">13</span><span class="p">}</span>
+<div class="highlight"><pre><code class="json"><span class="p">{</span><span 
class="nt">&quot;is-talk&quot;</span><span class="p">:</span><span 
class="mi">2</span><span class="p">,</span><span 
class="nt">&quot;bytes-added&quot;</span><span class="p">:</span><span 
class="mi">5276</span><span class="p">,</span><span 
class="nt">&quot;edits&quot;</span><span class="p">:</span><span 
class="mi">13</span><span class="p">,</span><span 
class="nt">&quot;unique-titles&quot;</span><span class="p">:</span><span 
class="mi">13</span><span class="p">}</span>
 <span class="p">{</span><span class="nt">&quot;is-bot-edit&quot;</span><span 
class="p">:</span><span class="mi">1</span><span class="p">,</span><span 
class="nt">&quot;is-talk&quot;</span><span class="p">:</span><span 
class="mi">3</span><span class="p">,</span><span 
class="nt">&quot;bytes-added&quot;</span><span class="p">:</span><span 
class="mi">4211</span><span class="p">,</span><span 
class="nt">&quot;edits&quot;</span><span class="p">:</span><span 
class="mi">30</span><span class="p">,</span><span 
class="nt">&quot;unique-titles&quot;</span><span class="p">:</span><span 
class="mi">30</span><span class="p">,</span><span 
class="nt">&quot;is-unpatrolled&quot;</span><span class="p">:</span><span 
class="mi">1</span><span class="p">,</span><span 
class="nt">&quot;is-new&quot;</span><span class="p">:</span><span 
class="mi">2</span><span class="p">,</span><span 
class="nt">&quot;is-minor&quot;</span><span class="p">:</span><span 
class="mi">7</span><span class="p">}</span>
 <span class="p">{</span><span class="nt">&quot;bytes-added&quot;</span><span 
class="p">:</span><span class="mi">3180</span><span class="p">,</span><span 
class="nt">&quot;edits&quot;</span><span class="p">:</span><span 
class="mi">19</span><span class="p">,</span><span 
class="nt">&quot;unique-titles&quot;</span><span class="p">:</span><span 
class="mi">19</span><span class="p">,</span><span 
class="nt">&quot;is-unpatrolled&quot;</span><span class="p">:</span><span 
class="mi">1</span><span class="p">,</span><span 
class="nt">&quot;is-new&quot;</span><span class="p">:</span><span 
class="mi">1</span><span class="p">,</span><span 
class="nt">&quot;is-minor&quot;</span><span class="p">:</span><span 
class="mi">3</span><span class="p">}</span>
 <span class="p">{</span><span class="nt">&quot;bytes-added&quot;</span><span 
class="p">:</span><span class="mi">2218</span><span class="p">,</span><span 
class="nt">&quot;edits&quot;</span><span class="p">:</span><span 
class="mi">18</span><span class="p">,</span><span 
class="nt">&quot;unique-titles&quot;</span><span class="p">:</span><span 
class="mi">18</span><span class="p">,</span><span 
class="nt">&quot;is-unpatrolled&quot;</span><span class="p">:</span><span 
class="mi">2</span><span class="p">,</span><span 
class="nt">&quot;is-new&quot;</span><span class="p">:</span><span 
class="mi">2</span><span class="p">,</span><span 
class="nt">&quot;is-minor&quot;</span><span class="p">:</span><span 
class="mi">3</span><span class="p">}</span></code></pre></div>
@@ -196,7 +196,7 @@ deploy/samza/bin/run-job.sh --config-fac
 
 <p>After you&rsquo;re done, you can clean everything up using the same grid 
script.</p>
 
-<div class="highlight"><pre><code class="language-bash" 
data-lang="bash">bin/grid stop all</code></pre></div>
+<div class="highlight"><pre><code class="bash">bin/grid stop 
all</code></pre></div>
 
 <p>Congratulations! You&rsquo;ve now setup a local grid that includes YARN, 
Kafka, and ZooKeeper, and run a Samza job on it. Next up, check out the <a 
href="/learn/documentation/0.7.0/introduction/background.html">Background</a> 
and <a href="/learn/documentation/0.7.0/api/overview.html">API Overview</a> 
pages.</p>
 

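Note on the HTML hunks in this commit: each removed/added pair appears to differ only in the `<code>` attributes, with `class="language-X" data-lang="X"` becoming `class="X"`, plus different line wrapping. The commit does not state what produced the change, so the following is only a sketch of how a reviewer might confirm the pattern. The function names are hypothetical.

```python
import re

# Old markup: class="language-X" data-lang="X". The two attributes may be
# separated by any whitespace, including a wrapped line.
OLD_CODE_ATTRS = re.compile(r'class="language-([\w+-]+)"\s+data-lang="\1"')


def normalize_code_class(html):
    """Rewrite the old attribute pair into the new single class attribute."""
    return OLD_CODE_ATTRS.sub(r'class="\1"', html)


def collapse_whitespace(text):
    """Collapse whitespace runs so mail line wrapping does not matter."""
    return " ".join(text.split())


def same_after_normalizing(removed, added):
    """True when the removed text equals the added text once the old
    attributes are rewritten and line wrapping is ignored."""
    return collapse_whitespace(normalize_code_class(removed)) == collapse_whitespace(added)


# Pair taken from the "bin/grid stop all" hunk of this commit.
removed = ('<div class="highlight"><pre><code class="language-bash" \n'
           'data-lang="bash">bin/grid stop all</code></pre></div>')
added = ('<div class="highlight"><pre><code class="bash">bin/grid stop \n'
         'all</code></pre></div>')

print(same_after_normalizing(removed, added))
```

Because whitespace is collapsed before comparing, the check would miss a change that only alters spacing inside a `<pre>` block; it is meant for spotting content changes hidden among the attribute rewrites.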
