Hi Claudio,
I've added [email protected] since
[email protected] is now deprecated with the move to the
Apache Incubator. Thank you for your questions. I've inlined responses
below.
Avery
On 9/9/11 3:16 AM, Claudio Martella wrote:
Hello,
I'm setting up my first Giraph application. The idea up to now has
been to use a Text as the vertex id (basically my vertices are RDF
nodes, so I'd like to use the URIs as IDs), so I've defined
MyVertex as extending Vertex<Text, IntWritable, Text, MyMessage>.
Ideally I'd like to avoid the IntWritable altogether, as I don't really
need to keep any state within the vertex for now, but that's ok.
This is all fine until I write the VertexReader
as TextVertexReader<Text, IntWritable, Text> and define the
constructor as:
public GraffitiVertexReader(RecordReader<Text, Text> arg) { super(arg); }
Apparently VertexReader requires RecordReader<LongWritable, Text>. Is
it mandatory to use a LongWritable?
TextVertexInputFormat uses TextInputFormat (Hadoop provided)
internally, and TextInputFormat extends FileInputFormat<LongWritable,
Text>, so the RecordReader has to use the corresponding types. The
LongWritable holds the position of the line within the file and the
Text holds the line of text itself. If you decide to extend
TextVertexInputFormat (good idea), you should look at
JsonBase64VertexInputFormat as an example. Your graph types need not
match the RecordReader<LongWritable, Text> types, since the
RecordReader is only held internally by your vertex reader, which
parses each line of text into your own vertex types.
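
To make the point above concrete, here is a minimal, Hadoop-free sketch of the idea: the reader receives a (file position, line) pair, ignores the position, and builds the vertex id from the line. Everything in it is hypothetical and only for illustration: the class and method names, the ParsedVertex stand-in, and the assumed tab-separated line layout (vertex URI first, then target URIs). It is not the Giraph API; a real reader would extend TextVertexReader and wrap the values in Text.

```java
import java.util.ArrayList;
import java.util.List;

public class LineKeyedVertexSketch {

    /** Hypothetical stand-in for a vertex whose id is a URI string. */
    static final class ParsedVertex {
        final String id;
        final List<String> targets;

        ParsedVertex(String id, List<String> targets) {
            this.id = id;
            this.targets = targets;
        }
    }

    /**
     * Mimics what a line-oriented record reader hands over: a key holding
     * the position in the file and a value holding the line. The key is
     * ignored; the vertex id comes entirely from the line.
     * Assumed layout: vertexUri TAB targetUri TAB targetUri ...
     */
    static ParsedVertex parseLine(long filePosition, String line) {
        String[] fields = line.split("\t");
        List<String> targets = new ArrayList<>();
        for (int i = 1; i < fields.length; i++) {
            targets.add(fields[i]);
        }
        return new ParsedVertex(fields[0], targets);
    }

    public static void main(String[] args) {
        ParsedVertex v = parseLine(
            0L, "http://example.org/a\thttp://example.org/b\thttp://example.org/c");
        System.out.println(v.id + " -> " + v.targets);
    }
}
```

The file position never reaches the vertex, which is why the record reader's key type places no constraint on the vertex id type.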
So the vertex id has to be a long? I don't understand the mismatch
between the VertexReader constructor and the Vertex definition.
Not at all, see above =). Let me know if it is still unclear.
Thanks!
--
Claudio Martella
[email protected]