Updated Branches: refs/heads/trunk 4a8b5c3f1 -> 4dd605a3a
GIRAPH-749: Update documentation according to the new EdgeOutputFormat API

Project: http://git-wip-us.apache.org/repos/asf/giraph/repo
Commit: http://git-wip-us.apache.org/repos/asf/giraph/commit/0be911bc
Tree: http://git-wip-us.apache.org/repos/asf/giraph/tree/0be911bc
Diff: http://git-wip-us.apache.org/repos/asf/giraph/diff/0be911bc

Branch: refs/heads/trunk
Commit: 0be911bc430d736befce5b3a272be485e835460c
Parents: 4a8b5c3
Author: Claudio Martella <[email protected]>
Authored: Wed Nov 13 01:06:15 2013 +0100
Committer: Claudio Martella <[email protected]>
Committed: Wed Nov 13 01:06:15 2013 +0100

----------------------------------------------------------------------
 src/site/xdoc/io.xml          | 14 +++++++++++++-
 src/site/xdoc/quick_start.xml | 18 ++++++++++++++----
 2 files changed, 27 insertions(+), 5 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/giraph/blob/0be911bc/src/site/xdoc/io.xml
----------------------------------------------------------------------
diff --git a/src/site/xdoc/io.xml b/src/site/xdoc/io.xml
index 38c74e0..6276de1 100644
--- a/src/site/xdoc/io.xml
+++ b/src/site/xdoc/io.xml
@@ -48,7 +48,7 @@
  To summarize, <code>VertexInputFormat</code> is usually used by itself, whereas <code>EdgeInputFormat</code> may be used in combination with <code>VertexValueInputFormat</code>.
  </p>
  <p>
- Output is always done on a per-vertex basis: a <code>VertexOutputFormat</code> will specify what data to write for each vertex. This usually means (some function of) the vertex value, but nothing prevents us from writing back the edges instead.
+ Output can be done both on a per-vertex and a per-edge basis: a <code>VertexOutputFormat</code> will specify what data to write for each vertex while <code>EdgeOutputFormat</code> will specify what data to write for each edge. This usually means (some function of) the vertex value, but nothing prevents us from writing back the edges instead.
  </p>
  <p>
  Let's have a quick look at the base classes:
@@ -71,6 +71,18 @@
  <li>
  <code>EdgeReader<I, E></code>: the main methods are <code>getCurrentSourceId()</code>, which returns the source vertex id, and <code>getCurrentEdge()</code>, which returns an <code>Edge<I, E></code> (i.e., the target vertex id, possibly with an edge value).
  </li>
+ <li>
+ <code>VertexOutputFormat<I, V, E></code>: modeled based on the Hadoop <code>OutputFormat</code> class, this class is intended for output vertices and related edges after the computation. The <code>createVertexWriter</code> returns a <code>VertexWriter</code> to save the vertices. Additionally <code>getOutputCommiter</code> returns an <code>OutputCommiter</code> used to guarantee that the output process is correctly committed and <code>checkOutputSpecs</code> is used to check that the correct setup before running the computation.
+ </li>
+ <li>
+ <code>VertexWriter<I, V, E></code>: this is where the user defines how to write vertices and possibly edges. The infrastructure just provides an <code>initialize</code> and a <code>close</code> method to deal with the initial and final part of the output. It also inherits <code>SimpleVertexWriter#writeVertex</code> which is the main function used to actually save the vertices.
+ </li>
+ <li>
+ <code>EdgeOutputFormat<I, V, E></code>: modeled similar to <code>VertexOutputFormat</code>, this class is intended for output edges after the computation. The <code>createEdgeWriter</code> returns a <code>EdgeWriter</code> to save the edges. Additionally <code>getOutputCommiter</code> returns an <code>OutputCommiter</code> used to guarantee that the output process is correctly committed and <code>checkOutputSpecs</code> is used to check that the correct setup before running the computation.
+ </li>
+ <li>
+ <code>EdgeWriter<I, V, E></code>: this class is similar to <code>VertexWriter</code> providing initialization and closing facilities. It is inteded to save edges and the main function that needs to be extended by the user for such purpose is <code>writeEdge</code>.
+ </li>
  </ul>
  </p>
  </section>

http://git-wip-us.apache.org/repos/asf/giraph/blob/0be911bc/src/site/xdoc/quick_start.xml
----------------------------------------------------------------------
diff --git a/src/site/xdoc/quick_start.xml b/src/site/xdoc/quick_start.xml
index d704cba..cdef668 100644
--- a/src/site/xdoc/quick_start.xml
+++ b/src/site/xdoc/quick_start.xml
@@ -218,9 +218,10 @@ $HADOOP_HOME/bin/hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-
  <p>This will output the following:</p>
  <source>
  usage: org.apache.giraph.utils.ConfigurationUtils [-aw <arg>] [-c <arg>]
- [-ca <arg>] [-cf <arg>] [-eif <arg>] [-eip <arg>] [-h] [-la] [-mc
- <arg>] [-vof <arg>] [-op <arg>] [-pc <arg>] [-q] [-ve <arg>] [-vif
- <arg>] [-vip <arg>] [-vvf <arg>] [-w <arg>] [-wc <arg>] [-yh <arg>]
+ [-ca <arg>] [-cf <arg>] [-eif <arg>] [-eip <arg>] [-eof <arg>]
+ [-esd <arg>] [-h] [-jyc <arg>] [-la] [-mc <arg>] [-op <arg>] [-pc
+ <arg>] [-q] [-th <arg>] [-ve <arg>] [-vif <arg>] [-vip <arg>] [-vof
+ <arg>] [-vsd <arg>] [-vvf <arg>] [-w <arg>] [-wc <arg>] [-yh <arg>]
  [-yj <arg>]
  -aw,--aggregatorWriter <arg> AggregatorWriter class
  -c,--messageCombiner <arg> Message messageCombiner class
@@ -234,16 +235,25 @@ usage: org.apache.giraph.utils.ConfigurationUtils [-aw <arg>] [-c <arg&
  -eif,--edgeInputFormat <arg> Edge input format
  -eip,--edgeInputPath <arg> Edge input path
  -eof,--vertexOutputFormat <arg> Edge output format
+ -esd,--edgeSubDir <arg> subdirectory to be used for the
+ edge output
  -h,--help Help
+ -jyc,--jythonClass <arg> Jython class name, used if
+ computation passed in is a python
+ script
  -la,--listAlgorithms List supported algorithms
  -mc,--masterCompute <arg> MasterCompute class
- -vof,--vertexOutputFormat <arg> Vertex output format
  -op,--outputPath <arg> Vertex output path
  -pc,--partitionClass <arg> Partition class
  -q,--quiet Quiet output
+ -th,--typesHolder <arg> Class that holds types. Needed
+ only if Computation is not set
  -ve,--outEdges <arg> Vertex edges class
  -vif,--vertexInputFormat <arg> Vertex input format
  -vip,--vertexInputPath <arg> Vertex input path
+ -vof,--vertexOutputFormat <arg> Vertex output format
+ -vsd,--vertexSubDir <arg> subdirectory to be used for the
+ vertex output
  -vvf,--vertexValueFactoryClass <arg> Vertex value factory class
  -w,--workers <arg> Number of workers
  -wc,--workerContext <arg> WorkerContext class
