Author: buildbot
Date: Fri Dec 14 17:15:01 2012
New Revision: 842355
Log:
Staging update by buildbot for crunch
Modified:
websites/staging/crunch/trunk/content/ (props changed)
websites/staging/crunch/trunk/content/crunch/index.html
websites/staging/crunch/trunk/content/crunch/intro.html
websites/staging/crunch/trunk/content/crunch/pipelines.html
Propchange: websites/staging/crunch/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Dec 14 17:15:01 2012
@@ -1 +1 @@
-1421632
+1421980
Modified: websites/staging/crunch/trunk/content/crunch/index.html
==============================================================================
--- websites/staging/crunch/trunk/content/crunch/index.html (original)
+++ websites/staging/crunch/trunk/content/crunch/index.html Fri Dec 14 17:15:01 2012
@@ -7,7 +7,7 @@
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="Content-Language" content="en" />
- <title>Apache Crunch - Apache Crunch &trade;</title>
+ <title>Apache Crunch - Apache Crunch™</title>
<link rel="stylesheet" href="/crunch/css/bootstrap-2.1.0.min.css" />
<link rel="stylesheet" href="/crunch/css/crunch.css" type="text/css">
@@ -115,7 +115,7 @@
<!-- CONTENT AREA -->
<div class="span10">
<h1 class="title">
- Apache Crunch &trade;
+ Apache Crunch™
<small>Simple and Efficient MapReduce Pipelines</small>
@@ -140,6 +140,7 @@ includes a REPL (read-eval-print loop) f
<ul>
<li><a href="intro.html">Introduction to the Apache Crunch API</a></li>
<li><a href="scrunch.html">Introduction to the Scrunch API</a></li>
+<li><a href="pipelines.html">Writing Your Own Pipelines</a></li>
<li><a href="future-work.html">Current Limitations and Future Work</a></li>
</ul>
<h2 id="disclaimer">Disclaimer</h2>
Modified: websites/staging/crunch/trunk/content/crunch/intro.html
==============================================================================
--- websites/staging/crunch/trunk/content/crunch/intro.html (original)
+++ websites/staging/crunch/trunk/content/crunch/intro.html Fri Dec 14 17:15:01 2012
@@ -120,9 +120,9 @@
</h1>
<h2 id="build-and-installation">Build and Installation</h2>
-<p>You can download the most recently released libraries from the <a
href="download.html">Download</a> page or from the Maven
+<p>You can download the most recently released Crunch libraries from the <a
href="download.html">Download</a> page or from the Maven
Central Repository.</p>
-<p>If you prefer, you can also build the libraries from the source code using
Maven and install
+<p>If you prefer, you can also build the Crunch libraries from the source code
using Maven and install
it in your local repository:</p>
<div class="codehilite"><pre><span class="n">mvn</span> <span
class="n">clean</span> <span class="n">install</span>
</pre></div>
@@ -155,7 +155,7 @@ them as a single, virtual PCollection. T
joins.</p>
<h3 id="pipeline-building-and-execution">Pipeline Building and Execution</h3>
<p>Every pipeline starts with a <code>Pipeline</code> object that is used to
coordinate building the pipeline and executing the underlying MapReduce
-jobs. For efficiency, the library uses lazy evaluation, so it will only
construct MapReduce jobs from the different stages of the pipelines when
+jobs. For efficiency, the Crunch planner uses lazy evaluation, so it will only
construct MapReduce jobs from the different stages of the pipelines when
the Pipeline object's <code>run</code> or <code>done</code> methods are
called.</p>
<h2 id="a-detailed-example">A Detailed Example</h2>
<p>Here is the classic WordCount application using the APIs:</p>
@@ -202,7 +202,7 @@ via the <code>SequenceFileSource</code>
<p>Note that each PCollection is a <em>reference</em> to a source of data- no
data is actually loaded into a
PCollection on the client machine.</p>
<h3 id="step-2-splitting-the-lines-of-text-into-words">Step 2: Splitting the
lines of text into words</h3>
-<p>The library defines a small set of primitive operations that can be
composed in order to build complex data
+<p>The Crunch library defines a small set of primitive operations that can be
composed in order to build complex data
pipelines. The first of these primitives is the <code>parallelDo</code>
function, which applies a function (defined
by a subclass of <code>DoFn</code>) to every record in a PCollection, and
returns a new PCollection that contains
the results.</p>
Modified: websites/staging/crunch/trunk/content/crunch/pipelines.html
==============================================================================
--- websites/staging/crunch/trunk/content/crunch/pipelines.html (original)
+++ websites/staging/crunch/trunk/content/crunch/pipelines.html Fri Dec 14 17:15:01 2012
@@ -119,7 +119,7 @@
</h1>
- <p>This section discusses the different steps of creating your own
pipelines in more detail.</p>
+ <p>This section discusses the different steps of creating your own
Crunch pipelines in more detail.</p>
<h2 id="writing-a-dofn">Writing a DoFn</h2>
<p>The DoFn class is designed to keep the complexity of the MapReduce APIs out
of your way when you
don't need them while still keeping them accessible when you do.</p>