Your best bet is to look over the two code components that users most often have to tweak or implement to write application code: the Vertex implementations in examples/ and benchmark/, and the IO formats and related goodies like RecordReaders, which are mostly in the io/ dir. You might also take a look at the test suite for some quick ideas of how some of the moving parts fit together.
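To give a feel for what a Vertex implementation boils down to, here is a minimal sketch of the vertex-centric BSP idea in plain Java, using single-source shortest paths as the example. It deliberately does not use the Giraph API: ToyVertex, compute(), run(), and sampleGraph() are hypothetical names made up for illustration, and the single-process loop only imitates supersteps and message delivery. It also assumes non-negative edge weights. The real classes in examples/ are the authority on how this looks in Giraph itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy, single-process model of vertex-centric BSP. This is NOT the Giraph API;
// every name here is hypothetical and exists only for illustration.
class BspShortestPathsSketch {

    /** Stand-in for a vertex: an id, a current value, and weighted out-edges. */
    static final class ToyVertex {
        final long id;
        double distance = Double.POSITIVE_INFINITY;
        final Map<Long, Double> outEdges = new HashMap<>();

        ToyVertex(long id) {
            this.id = id;
        }
    }

    /** Per-vertex logic for one superstep: read messages, update value, send messages. */
    static void compute(ToyVertex v, List<Double> messages, long sourceId,
                        Map<Long, List<Double>> outbox) {
        double best = (v.id == sourceId) ? 0.0 : Double.POSITIVE_INFINITY;
        for (double m : messages) {
            best = Math.min(best, m);
        }
        if (best < v.distance) {
            v.distance = best;
            // Tell each neighbor how far away it is when reached through this vertex.
            for (Map.Entry<Long, Double> e : v.outEdges.entrySet()) {
                outbox.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                      .add(best + e.getValue());
            }
        }
    }

    /** Runs supersteps until no messages are in flight; returns distances by vertex id. */
    static Map<Long, Double> run(Map<Long, ToyVertex> graph, long sourceId) {
        Map<Long, List<Double>> inbox = new HashMap<>();
        boolean firstSuperstep = true;
        while (firstSuperstep || !inbox.isEmpty()) {
            Map<Long, List<Double>> outbox = new HashMap<>();
            for (ToyVertex v : graph.values()) {
                List<Double> messages = inbox.getOrDefault(v.id, Collections.emptyList());
                // Every vertex runs in the first superstep; afterwards only those with mail.
                if (firstSuperstep || !messages.isEmpty()) {
                    compute(v, messages, sourceId, outbox);
                }
            }
            inbox = outbox; // messages sent in this superstep arrive in the next one
            firstSuperstep = false;
        }
        Map<Long, Double> result = new TreeMap<>();
        for (ToyVertex v : graph.values()) {
            result.put(v.id, v.distance);
        }
        return result;
    }

    /** Small made-up graph: 1 -> 2 (1.0), 1 -> 3 (4.0), 2 -> 3 (2.0). */
    static Map<Long, ToyVertex> sampleGraph() {
        Map<Long, ToyVertex> graph = new HashMap<>();
        for (long id = 1; id <= 3; id++) {
            graph.put(id, new ToyVertex(id));
        }
        graph.get(1L).outEdges.put(2L, 1.0);
        graph.get(1L).outEdges.put(3L, 4.0);
        graph.get(2L).outEdges.put(3L, 2.0);
        return graph;
    }

    public static void main(String[] args) {
        System.out.println(run(sampleGraph(), 1L));
    }
}
```

On the made-up sample graph, vertex 3 first hears 4.0 directly from vertex 1 and then improves to 3.0 one superstep later via vertex 2. The point to take away is the shape of the work: all the application logic sits in the per-vertex compute step, and the framework owns the loop around it.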
If you have real work to do with Giraph, you're going to need to get used to 0.2 and its API. The old API is both limited in what kind of data it will process and not compatible going forward. The API we have now, while still evolving, is much, much closer to being "final" than anything in 0.1. And regardless, we now know (in hindsight) that none of the code you write for 0.1 will be portable into the future.

I am first in line to be sorry about the state of the docs. There are efforts underway now to fix this. We all owe the users a collective apology for this. In lieu of proper apologies, feel free to ask any and all questions, no matter how dumb; they can't be as dumb as mine!

The codebase is under heavy development and has a lot of confusingly named moving parts, so first get used to the plumbing an app writer has to know to function, get some apps up and running, then dig into the framework code and it will make more sense.

One string to pull on to begin to look inside the framework is:

bin/giraph
-> org.apache.giraph.GiraphRunner (hands job to Hadoop)
-> ...
-> o.a.g.graph.GraphMapper (a mapper instance on a Hadoop cluster, started according to the Job submitted to Hadoop, but running our BSP code instead)
-> o.a.g.graph.GraphTaskManager
-> lots of places from there...

The overarching BSP activity management for a single job run basically all stems from GraphTaskManager now. You can look at setup() and execute() to get a decent idea of the major events in a job run, and of where to look for a better peek under the hood at any given task or event.

Good luck!

On Fri, Feb 1, 2013 at 4:59 PM, Gustavo Enrique Salazar Torres <gsala...@ime.usp.br> wrote:

> Hi Ryan:
>
> It's the simplest thing:
> 1. Define your type of parameters for a type of Vertex (for example
> EdgeListVertex)
> 2. Implement compute method.
>
> From what I saw out there in the M/R world, Giraph provides the simplest
> way to work with graphs.
>
> Take a look at
> https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example and
> use release 0.1 (http://www.apache.org/dyn/closer.cgi/incubator/giraph/)
> because 0.2-SNAPSHOT is under heavy work.
>
> Hope this helps you.
>
> Gustavo
>
> On Fri, Feb 1, 2013 at 9:17 PM, Ryan Compton <compton.r...@gmail.com> wrote:
>
>> I am having trouble understanding what all the classes do and the
>> documentation looks like it might be out of date. I searched around
>> and found this: https://github.com/edaboussi/Giraph but it won't
>> compile with 0.2, any suggestions?