Your best bet is to look over the two code components that users most often
have to tweak or implement to write application code: the Vertex
implementations in examples/ and benchmark/, and the IO formats and related
goodies such as RecordReaders, which are mostly in the io/ dir. You might
also take a look at the test suite for some quick ideas of how some of the
moving parts fit together.
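
To give a feel for what a Vertex implementation's compute method is doing
before you open those directories, here is a toy, single-process sketch of
the vertex-centric BSP model. It is NOT the Giraph API: BspSketch,
VertexProgram, shortestPathsFrom and run are all names made up for this
illustration, and the real 0.2 classes and signatures will differ. It
assumes every vertex appears as a key in the graph map and that edge
weights are non-negative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy single-process BSP loop. Hypothetical names throughout; not Giraph code. */
public class BspSketch {

    /** Stand-in for the per-vertex logic an application writer supplies. */
    public interface VertexProgram {
        // Returns the vertex's new value; queues outgoing messages in 'outbox'.
        double compute(int id, double value, List<Double> messages,
                       Map<Integer, Double> outEdges,
                       Map<Integer, List<Double>> outbox);
    }

    /** Single-source shortest paths: keep the smallest distance seen so far. */
    public static VertexProgram shortestPathsFrom(int source) {
        return (id, value, messages, outEdges, outbox) -> {
            double best = (id == source) ? 0.0 : Double.POSITIVE_INFINITY;
            for (double m : messages) {
                best = Math.min(best, m);
            }
            if (best < value) {
                // Distance improved: tell each neighbor what it costs via us.
                for (Map.Entry<Integer, Double> e : outEdges.entrySet()) {
                    outbox.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                          .add(best + e.getValue());
                }
                return best;
            }
            return value;
        };
    }

    /** Runs supersteps until no messages are in flight. */
    public static Map<Integer, Double> run(
            Map<Integer, Map<Integer, Double>> graph, VertexProgram program) {
        Map<Integer, Double> values = new HashMap<>();
        for (Integer id : graph.keySet()) {
            values.put(id, Double.POSITIVE_INFINITY);
        }
        Map<Integer, List<Double>> inbox = new HashMap<>();
        for (long superstep = 0; superstep == 0 || !inbox.isEmpty(); superstep++) {
            Map<Integer, List<Double>> outbox = new HashMap<>();
            for (Integer id : graph.keySet()) {
                List<Double> messages =
                        inbox.getOrDefault(id, Collections.emptyList());
                if (superstep > 0 && messages.isEmpty()) {
                    continue; // vertex is halted until a message wakes it up
                }
                values.put(id, program.compute(
                        id, values.get(id), messages, graph.get(id), outbox));
            }
            inbox = outbox; // barrier: messages arrive in the next superstep
        }
        return values;
    }

    public static void main(String[] args) {
        Map<Integer, Map<Integer, Double>> graph = new HashMap<>();
        graph.put(1, new HashMap<>());
        graph.put(2, new HashMap<>());
        graph.put(3, new HashMap<>());
        graph.get(1).put(2, 1.0);
        graph.get(1).put(3, 5.0);
        graph.get(2).put(3, 2.0);
        System.out.println(run(graph, shortestPathsFrom(1)));
    }
}
```

On that three-vertex graph the distances from vertex 1 come out as 0.0, 1.0
and 3.0 for vertices 1, 2 and 3. The thing to notice is the split: the
application writer only supplies the per-vertex logic, while the loop in
run() plays the part the framework plays for you on a real cluster.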

If you have real work to do with Giraph, you're going to need to get used
to 0.2 and its API. The old API is limited in what kinds of data it can
process, and it will not be compatible with future releases. The API we
have now, while still evolving, is much closer to being "final" than
anything in 0.1. And regardless, we now know (in hindsight) that none of
the code you write for 0.1 will be portable to future versions.

I am first in line to be sorry about the state of the docs, and we all owe
the users a collective apology for it. There are efforts underway now to
fix this. In lieu of proper apologies, feel free to ask any and all
questions, no matter how dumb; they can't be as dumb as mine! The codebase
is under heavy development and has a lot of confusingly named moving parts,
so first get used to the plumbing an app writer has to know to function
and get some apps up and running. Then dig into the framework code and it
will make more sense.

One string to pull on to begin looking inside the framework is bin/giraph
-> org.apache.giraph.GiraphRunner (hands the job to Hadoop) -> ... ->
o.a.g.graph.GraphMapper (a mapper instance on a Hadoop cluster, started
according to the Job submitted to Hadoop, but running our BSP code instead)
-> o.a.g.graph.GraphTaskManager -> lots of places from there...

The overarching BSP activity management for a single job run now basically
all stems from GraphTaskManager. You can look at setup() and execute() to
get a decent idea of the major events in a job run, and of where to look
for a better peek under the hood at any given task or event.
Good luck!


On Fri, Feb 1, 2013 at 4:59 PM, Gustavo Enrique Salazar Torres <
gsala...@ime.usp.br> wrote:

> Hi Ryan:
>
> It's the simplest thing:
> 1. Define your type of parameters for a type of Vertex (for example
> EdgeListVertex)
> 2. Implement compute method.
>
> From what I saw out there in the M/R world, Giraph provides the simplest
> way to work with graphs.
>
> Take a look at
> https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example and
> use release 0.1 (http://www.apache.org/dyn/closer.cgi/incubator/giraph/)
> because 0.2-SNAPSHOT is under heavy work.
>
> Hope this helps you.
>
> Gustavo
>
> On Fri, Feb 1, 2013 at 9:17 PM, Ryan Compton <compton.r...@gmail.com> wrote:
>
>> I am having trouble understanding what all the classes do and the
>> documentation looks like it might be out of date. I searched around
>> and found this: https://github.com/edaboussi/Giraph but it won't
>> compile with 0.2, any suggestions?
>>