I don’t see how I could possibly pass in the input file twice. I am simply
using GiraphRunner and specifying -vif and -vip;
the path only contains the file once.
I do agree, though, that it is being read in as two identical input splits. In
fact, you see this line:
hdfs://arcus1.silverdale.dev/t
-giraph-example/1390935731/input/sm
all:0+172'
Have you by any chance accidentally passed in the input file twice?
Rob
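
One quick way to confirm whether the same split really is being handed out twice is to collect the split descriptors from the logs and look for repeats. Hadoop prints a file split as path:start+length, so two identical descriptors mean the same byte range is being read twice. The following is only a rough sketch in plain Java, not part of Giraph: the class name, the method name, and the example host and paths are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SplitDuplicateCheck {

    // Returns each split descriptor that occurs more than once,
    // listed once, in the order its first repeat was seen.
    public static List<String> findDuplicates(List<String> splitDescriptors) {
        Set<String> seen = new HashSet<>();
        Set<String> reported = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String raw : splitDescriptors) {
            String descriptor = raw.trim();
            // add() returns false when the descriptor was already present
            if (!seen.add(descriptor) && reported.add(descriptor)) {
                duplicates.add(descriptor);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Hypothetical descriptors in path:start+length form;
        // the host and paths are placeholders, not the ones from this thread.
        List<String> splits = List.of(
            "hdfs://example-host/input/part-0:0+172",
            "hdfs://example-host/input/part-0:0+172",
            "hdfs://example-host/input/part-1:0+90");
        for (String duplicate : findDuplicates(splits)) {
            System.out.println("read more than once: " + duplicate);
        }
    }
}
```

If a descriptor shows up here, the duplication is happening when the splits are generated or assigned, not in the vertex reader itself.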
From: Eric Kimbrel
Reply-To:
Date: Wednesday, 29 January 2014 09:08
To:
Subject: duplicate edges created with TextVertexInputFormat
I am reading in an adjacency list using an input format which extends
TextVertexInputFormat. My code doesn’t do anything to address input splits,
but leaves that to the underlying giraph implementation. However it appears
that as the data is being read 2 identical input splits are created and