Hi,

I'm just getting started with Giraph, and I'm struggling a bit to understand exactly what is needed to run a minimal Giraph computation on real data, rather than on the synthetic graph generated by PseudoRandomVertexInputFormat.
Apologies if this is covered somewhere in the docs or mailing list archives. I looked but couldn't find anything that applies to the current version, and I couldn't work out exactly how things have changed between versions. Some older code that I tried was clearly incompatible with the current version.

Trying to learn by example, I copied the current o.a.g.benchmark.ShortestPathsBenchmark and o.a.g.benchmark.ShortestPathsComputation into my own project, and modified them to run on their own, without GiraphBenchmark and BenchmarkOption. Here is the new ShortestPathsBenchmark I ended up with:

http://pastebin.com/h3rH6jTm

With PseudoRandomVertexInputFormat and some hard-coded options for aggregateVertices and edgesPerVertex, this runs fine from my jar with the command:

    hadoop jar giraph-testing-jar-with-dependencies.jar modified_benchmarks.ShortestPathsBenchmark --workers 10

Now I'd like to use JsonLongDoubleFloatDoubleVertexInputFormat with some real data, but I see no way to specify the input path. If this were plain Hadoop, I'd expect to be able to say something like:

    JsonLongDoubleFloatDoubleVertexInputFormat.addInputPath(job, new Path("/some/path"));

That's not available, though. Could someone point me in the right direction with this? Am I going about this all wrong?

Thanks for any help,
Matt