Look at the io/ and example/ directories for the pieces you will need to
put together to get a real application working. If you can reuse an
existing IO format, do it. If not, you will have to mirror the patterns in
io/ to write your own. Beyond that, the examples are the best place to
start. To get an idea of how application code interacts with the framework,
check out the benchmark/ code (which runs itself directly as a Hadoop job
rather than going through bin/giraph and GraphRunner, as app code often
does) and especially the unit tests. For the example you mention, check the
tests and the IO format to make sure it expects the line-by-line formatting
your text input is in.
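
As a rough way to sanity-check the input files before digging into the IO
format source, something like the sketch below could help. It assumes each
line should look like [vertexId,vertexValue,[[targetId,edgeValue],...]],
which is only inferred from the sample lines in your message, not taken from
the IO format code. The class name VertexLineCheck and its regex are
hypothetical helpers, not part of Giraph, and the regex is not the parser
Giraph uses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Giraph. The expected line shape is an
// assumption based on the sample input: [id,value,[[target,weight],...]]
class VertexLineCheck {
    // An integer or decimal number, optionally negative.
    private static final String NUM = "-?\\d+(?:\\.\\d+)?";
    // One edge: [targetId,edgeValue]
    private static final String EDGE =
        "\\[\\s*" + NUM + "\\s*,\\s*" + NUM + "\\s*\\]";
    // A whole line: [id,value,[edge,edge,...]] with an optional edge list body.
    private static final Pattern LINE = Pattern.compile(
        "\\s*\\[\\s*" + NUM + "\\s*,\\s*" + NUM + "\\s*,\\s*\\[\\s*(?:"
        + EDGE + "(?:\\s*,\\s*" + EDGE + ")*)?\\s*\\]\\s*\\]\\s*");

    // Returns true if the line has the assumed vertex-line shape.
    static boolean isValidLine(String line) {
        return LINE.matcher(line).matches();
    }

    // Usage: java VertexLineCheck <input-file>
    // Prints every non-blank line that does not match the assumed shape.
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            if (!isValidLine(line)) {
                System.out.println("line " + (i + 1) + " does not match: " + line);
            }
        }
    }
}
```

Run it once per part-m-* file. If nothing is printed, the lines at least
have a consistent shape, and the next thing to compare is what the IO format
used by the example actually parses.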

More docs are forthcoming. Sorry about that; the codebase has changed a lot
in a short period of time, and I think we will be seeing more comprehensive
documentation before long. Also, take a look at the mailing list archives;
there are similar questions to yours in there.


On Tue, Jan 15, 2013 at 9:08 AM, Kammer, John <kamm...@bit-sys.com> wrote:

>  Hello,
>    Still rather new to Giraph and Hadoop but trying to research and learn
> the software.
>
>     Am trying to get the SimpleShortestPathsVertex in the examples
> directory to run from within Eclipse. I would swear that it ran and
> produced results once last week, but have been unable to get the results
> output from the code since. If I am reading this output correctly it
> appears that the application is not (currently) reading the input data,
> although I have no idea why that would be the case.
>
>    Input args:
>     /home/hduser/Downloads/shortestPathsInputGraph/
>     /home/hduser/Downloads/shortestPathsOutputGraph
>     1
>     1
>
>  The three files in the input directory are as follows:
> part-m-00001  part-m-00002  part-m-00003
>  and contain:
> [10,4500,[[11,2000]]]
> [11,5500,[[12,1100]]]
> [12,6600,[[13,1200]]]
> [13,7800,[[14,1300]]]
> [14,9100,[[0,1400]]]
>
> [5,1000,[[6,500]]]
> [6,1500,[[7,600]]]
> [7,2100,[[8,700]]]
> [8,2800,[[9,800]]]
> [9,3600,[[10,900]]]
>
> [0,0,[[1,0]]]
> [1,0,[[2,100]]]
> [2,100,[[3,200]]]
> [3,300,[[4,300]]]
> [4,600,[[5,400]]]
>
>  Running with the above creates the output directory, which contains only
> an empty file named _success. There are no accompanying results.
>
>  Log messages from the run are as follows:
> 13/01/14 16:44:45 WARN util.NativeCodeLoader: Unable to load
> native-hadoop library for your platform... using builtin-java classes where
> applicable
> 13/01/14 16:44:45 INFO mapred.JobClient: Running job: job_local_0001
> 13/01/14 16:44:45 INFO util.ProcessTree: setsid exited with exit code 0
> 13/01/14 16:44:45 INFO mapred.Task:  Using ResourceCalculatorPlugin :
> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@676c6370
> 13/01/14 16:44:45 INFO graph.GraphMapper: Distributed cache is empty.
> Assuming fatjar.
> 13/01/14 16:44:45 INFO graph.GraphMapper: setup: classpath @
> /tmp/hadoop-hduser/mapred/staging/hduser-418008403/.staging/job_local_0001/job.jar
> 13/01/14 16:44:45 INFO zk.ZooKeeperManager: createCandidateStamp: Made the
> directory _bsp/_defaultZkManagerDir/job_local_0001
> 13/01/14 16:44:45 INFO zk.ZooKeeperManager: createCandidateStamp: Creating
> my filestamp _bsp/_defaultZkManagerDir/job_local_0001/_task/bitskammer 0
> 13/01/14 16:44:45 INFO zk.ZooKeeperManager: getZooKeeperServerList: For
> task 0, got file 'zkServerList_bitskammer 0 ' (polling period is 3000)
> 13/01/14 16:44:45 INFO zk.ZooKeeperManager: getZooKeeperServerList: Found
> [bitskammer, 0] 2 hosts in filename 'zkServerList_bitskammer 0 '
> 13/01/14 16:44:45 INFO graph.GraphMapper: cleanup: Starting for UNKNOWN
> 13/01/14 16:44:45 INFO mapred.Task: Task:attempt_local_0001_m_000000_0
> is done. And is in the process of commiting
> 13/01/14 16:44:46 INFO mapred.JobClient:  map 0% reduce 0%
> 13/01/14 16:44:48 INFO mapred.LocalJobRunner: setup: Setting up Zookeeper
> manager.
> 13/01/14 16:44:48 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0'
> done.
> 13/01/14 16:44:49 INFO mapred.JobClient:  map 100% reduce 0%
> 13/01/14 16:44:49 INFO mapred.JobClient: Job complete: job_local_0001
> 13/01/14 16:44:49 INFO mapred.JobClient: Counters: 12
> 13/01/14 16:44:49 INFO mapred.JobClient:   File Output Format Counters
> 13/01/14 16:44:49 INFO mapred.JobClient:     Bytes Written=0
> 13/01/14 16:44:49 INFO mapred.JobClient:   File Input Format Counters
> 13/01/14 16:44:49 INFO mapred.JobClient:     Bytes Read=0
> 13/01/14 16:44:49 INFO mapred.JobClient:   FileSystemCounters
> 13/01/14 16:44:49 INFO mapred.JobClient:     FILE_BYTES_READ=5138685
> 13/01/14 16:44:49 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=5212576
> 13/01/14 16:44:49 INFO mapred.JobClient:   Map-Reduce Framework
> 13/01/14 16:44:49 INFO mapred.JobClient:     Map input records=1
> 13/01/14 16:44:49 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=0
> 13/01/14 16:44:49 INFO mapred.JobClient:     Spilled Records=0
> 13/01/14 16:44:49 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=60162048
> 13/01/14 16:44:49 INFO mapred.JobClient:     CPU time spent (ms)=0
> 13/01/14 16:44:49 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=0
> 13/01/14 16:44:49 INFO mapred.JobClient:     SPLIT_RAW_BYTES=44
> 13/01/14 16:44:49 INFO mapred.JobClient:     Map output records=0
>
>
>
>  Any help would be appreciated.
>
>  Is there by any chance a tutorial anywhere on how to build your own
> Giraph application? I've been reading through source code but feel like
> I've started somewhere in the middle and can never find the starting place
> - just going around in circles. I could really use a "start-here"
> reference.
>
>  In any event, thanks for any and all help!
>
