I think you answered your own question, "Or am I supposed to write a VertexOutputFormat implementation that generates no output for the vertices that have no data?": yes.
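Stripped of Giraph, the idea is just a writer that checks each vertex and emits nothing when there is no data. Here is a minimal plain-Java sketch of that filter; the class name SubsetOutputSketch, the method writeNonEmpty, and the use of a Map as a stand-in for the graph are all hypothetical and are not part of the Giraph API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a vertex writer; none of these names come from Giraph.
public class SubsetOutputSketch {

    // Builds one "id<TAB>value" line per vertex, skipping vertices whose value is empty.
    static String writeNonEmpty(Map<String, String> vertexValues) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> vertex : vertexValues.entrySet()) {
            if (!vertex.getValue().isEmpty()) {
                out.append(vertex.getKey())
                   .append('\t')
                   .append(vertex.getValue())
                   .append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vertexValues = new LinkedHashMap<>();
        vertexValues.put("a", "reached");
        vertexValues.put("b", "");        // no data, so no output line is produced
        vertexValues.put("c", "reached");
        System.out.print(writeNonEmpty(vertexValues));
    }
}
```

In a real job the same check would sit inside the vertex writer's write method, so vertices without data never reach the record writer.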
But don't be put off; it is actually a very simple class to override. Here is an example for something like what you describe:

    package com.ebay.foo.bar.giraph.io.formats;

    import org.apache.giraph.graph.Vertex;
    import org.apache.giraph.io.formats.TextVertexOutputFormat;
    import org.apache.hadoop.io.BooleanWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;

    import java.io.IOException;

    public class ExampleOutputFormat
        extends TextVertexOutputFormat<Text, Text, BooleanWritable> {

      /** Writes a vertex only when it carries a non-empty value. */
      public class ExampleWriter extends TextVertexWriter {
        @Override
        public void writeVertex(Vertex<Text, Text, BooleanWritable> vertex)
            throws IOException, InterruptedException {
          // Vertices with an empty value produce no output line at all.
          if (!vertex.getValue().toString().isEmpty()) {
            getRecordWriter().write(vertex.getId(), vertex.getValue());
          }
        }
      }

      @Override
      public TextVertexWriter createVertexWriter(TaskAttemptContext context)
          throws IOException, InterruptedException {
        return new ExampleWriter();
      }
    }

Thomas A J Schweiger
Sr. Software Architect
GDI-Inc Data Services-Seattle
Office: (425) 586-2669
email: thschwei...@ebay.com

________________________________
From: matthewcorn...@gmail.com [matthewcorn...@gmail.com] on behalf of Matthew Cornell [m...@matthewcornell.org]
Sent: Monday, August 25, 2014 11:38 AM
To: user
Subject: How do I output only a subset of a graph?

Hi Folks. I have a graph computation that starts with a subset of vertices of a certain type and propagates information through the graph to a set of target vertices, which are also a subset of the graph. I want to output only information from those particular vertices, but I don't see a way to do this in the various VertexOutputFormat subclasses, which all seem oriented to outputting something for every vertex in the graph. How do I do this? E.g., are there hooks for the output phase where I can filter output?
Or am I supposed to write a VertexOutputFormat implementation that generates no output for the vertices that have no data? Thanks in advance.

-- Matthew Cornell | m...@matthewcornell.org | 413-626-3621 | 34 Dickinson Street, Amherst MA 01002 | matthewcornell.org