RE: How do I output only a subset of a graph?

2014-08-25 Thread Schweiger, Tom

I think you answered your own question: "Or am I supposed to write a 
VertexOutputFormat implementation that generates no output for the vertices 
that have no data?" Yes!

But don't be put off; it is actually a very simple class to override. Here is 
an example of something like what you describe:


package com.ebay.foo.bar.giraph.io.formats;

import org.apache.giraph.graph.Vertex;
import org.apache.giraph.io.formats.TextVertexOutputFormat;
import org.apache.hadoop.io.BooleanWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

import java.io.IOException;

// The type parameters (vertex ID, vertex value, edge value) are inferred from
// the imports and the write() call; adjust them to match your graph. This
// assumes the three-parameter Vertex of Giraph 1.1.
public class ExampleOutputFormat extends
    TextVertexOutputFormat<Text, Text, BooleanWritable> {

  public class ExampleWriter extends TextVertexWriter {

    @Override
    public void writeVertex(Vertex<Text, Text, BooleanWritable> vertex)
        throws IOException, InterruptedException {
      // Write only the vertices that carry data; skip all the others.
      if (!vertex.getValue().toString().isEmpty()) {
        getRecordWriter().write(vertex.getId(), vertex.getValue());
      }
    }
  }

  @Override
  public TextVertexWriter createVertexWriter(TaskAttemptContext context)
      throws IOException, InterruptedException {
    return new ExampleWriter();
  }

}
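
To try the filtering idea without a Giraph or Hadoop install, here is a minimal, dependency-free sketch. It is not Giraph code: the class SubsetOutputSketch and the methods hasData and linesToWrite are hypothetical stand-ins, a plain map plays the role of the graph, and the tab-separated "id, value" line layout and the extra null check are assumptions. It only shows the pattern the writer above relies on: the output phase visits every vertex, and the writer simply declines to write the ones without data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SubsetOutputSketch {

    // Mirrors the check in writeVertex(): a vertex counts as having data
    // only if its value is a non-empty string (the null check is an addition).
    public static boolean hasData(String value) {
        return value != null && !value.isEmpty();
    }

    // Stand-in for the output phase: every vertex is visited once, but a
    // line is produced only for vertices that pass the hasData() check.
    public static List<String> linesToWrite(Map<String, String> vertexValues) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : vertexValues.entrySet()) {
            if (hasData(entry.getValue())) {
                lines.add(entry.getKey() + "\t" + entry.getValue());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical graph: only "b" and "d" received information.
        Map<String, String> vertexValues = new LinkedHashMap<>();
        vertexValues.put("a", "");
        vertexValues.put("b", "reached");
        vertexValues.put("c", "");
        vertexValues.put("d", "reached");

        for (String line : linesToWrite(vertexValues)) {
            System.out.println(line);
        }
    }
}
```

Running main prints one line each for "b" and "d" and nothing for "a" and "c". In the real output format the same decision lives in writeVertex(), so no separate filtering hook is needed.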



Thomas A J Schweiger
Sr. Software Architect
GDI-Inc Data Services-Seattle

Office: (425) 586-2669
email: thschwei...@ebay.com

From: matthewcorn...@gmail.com [matthewcorn...@gmail.com] on behalf of Matthew 
Cornell [m...@matthewcornell.org]
Sent: Monday, August 25, 2014 11:38 AM
To: user
Subject: How do I output only a subset of a graph?

Hi Folks. I have a graph computation that starts with a subset of vertices of a 
certain type and propagates information through the graph to a set of target 
vertices, which are also a subset of the graph. I want to output only information 
from those particular vertices, but I don't see a way to do this in the various 
VertexOutputFormat subclasses, which all seem oriented to outputting something 
for every vertex in the graph. How do I do this? E.g., are there hooks for the 
output phase where I can filter output? Or am I supposed to write a 
VertexOutputFormat implementation that generates no output for the vertices 
that have no data? Thanks in advance.

--
Matthew Cornell | m...@matthewcornell.org | 
413-626-3621 | 34 Dickinson Street, Amherst MA 01002 | 
matthewcornell.org


How do I look up a Vertex using its ID?

2014-08-25 Thread Matthew Cornell

Hi Folks. I have a graph computation that passes 'visited' Vertex IDs
around, and I need to output information from those in the output phase.
How do I look up a Vertex from its ID? I found Partition.getVertex(), but
IIUC there is no guarantee that an arbitrary Vertex will be in a particular
partition. Thanks in advance.

-- 
Matthew Cornell | m...@matthewcornell.org | 413-626-3621 | 34 Dickinson
Street, Amherst MA 01002 | matthewcornell.org


How do I output only a subset of a graph?

2014-08-25 Thread Matthew Cornell

Hi Folks. I have a graph computation that starts with a subset of vertices
of a certain type and propagates information through the graph to a set of
target vertices, which are also a subset of the graph. I want to output only
information from those particular vertices, but I don't see a way to do
this in the various VertexOutputFormat subclasses, which all seem oriented
to outputting something for every vertex in the graph. How do I do this?
E.g., are there hooks for the output phase where I can filter output? Or am
I supposed to write a VertexOutputFormat implementation that generates no
output for the vertices that have no data? Thanks in advance.

-- 
Matthew Cornell | m...@matthewcornell.org | 413-626-3621 | 34 Dickinson
Street, Amherst MA 01002 | matthewcornell.org