To add: I ran it with workers=3. The job completed successfully, and the output now has 6667 lines.
4 mappers:
Mapper 000 (on master)
Mapper 001 (on slave) - wrote out 3333 lines
Mapper 002 (on slave) - wrote out 0 lines
Mapper 003 (on slave) - wrote out 3334 lines

Vishal

On Tue, Aug 7, 2012 at 10:04 AM, Vishal Patel <write2vis...@gmail.com> wrote:

> Yes, I can run up to 18 mappers, mapred.tasktracker.map.tasks is set to 4
> on the master and slave (there are only 2 machines). The
> mapred.tasktracker.map.tasks.maximum is higher.
>
> So when I do worker=1, giraph started 2 map tasks. 1 completed, 1 failed
>
> STATUS: setup: Connected to Zookeeper service master:22181
>
> java.lang.Throwable: Child Error
>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>
> Task attempt_201208070927_0001_m_000000_1 failed to report status for 600
> seconds. Killing!
>
> Second mapper also turned to the same error after 5 mins,
>
> Task attempt_201208070927_0001_m_000001_0 failed to report status for 601
> seconds. Killing!
>
> Next, I tried workers=2, and giraph started 3 mappers, no errors and the job
> was "Completed" on hadoop's web interface.
> However the solution was incorrect and
>
> /giraph_out/two/part-m-00001 had 0 lines
> /giraph_out/two/part-m-00002 had 5000 lines
>
> Here is the command line out,
> 12/08/07 09:46:25 INFO mapred.JobClient: Running job: job_201208070927_0003
> 12/08/07 09:46:26 INFO mapred.JobClient: map 0% reduce 0%
> 12/08/07 09:46:42 INFO mapred.JobClient: map 100% reduce 0%
> 12/08/07 09:46:47 INFO mapred.JobClient: Job complete: job_201208070927_0003
> 12/08/07 09:46:47 INFO mapred.JobClient: Counters: 41
> 12/08/07 09:46:47 INFO mapred.JobClient: Job Counters
> 12/08/07 09:46:47 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=25701
> 12/08/07 09:46:47 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
> 12/08/07 09:46:47 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
> 12/08/07 09:46:47 INFO mapred.JobClient: Launched map tasks=3
> 12/08/07 09:46:47 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
> 12/08/07 09:46:47 INFO mapred.JobClient: Giraph Timers
> 12/08/07 09:46:47 INFO mapred.JobClient: Total (milliseconds)=4203
> 12/08/07 09:46:47 INFO mapred.JobClient: Superstep 3 (milliseconds)=25
> 12/08/07 09:46:47 INFO mapred.JobClient: Vertex input superstep (milliseconds)=426
> 12/08/07 09:46:47 INFO mapred.JobClient: Superstep 4 (milliseconds)=69
> 12/08/07 09:46:47 INFO mapred.JobClient: Setup (milliseconds)=3076
> 12/08/07 09:46:47 INFO mapred.JobClient: Superstep 7 (milliseconds)=17
> 12/08/07 09:46:47 INFO mapred.JobClient: Shutdown (milliseconds)=93
> 12/08/07 09:46:47 INFO mapred.JobClient: Superstep 0 (milliseconds)=197
> 12/08/07 09:46:47 INFO mapred.JobClient: Superstep 8 (milliseconds)=62
> 12/08/07 09:46:47 INFO mapred.JobClient: Superstep 9 (milliseconds)=21
> 12/08/07 09:46:47 INFO mapred.JobClient: Superstep 6 (milliseconds)=59
> 12/08/07 09:46:47 INFO mapred.JobClient: Superstep 5 (milliseconds)=19
> 12/08/07
09:46:47 INFO mapred.JobClient: Superstep 2 (milliseconds)=78 > > > > > 12/08/07 09:46:47 INFO mapred.JobClient: Superstep 1 (milliseconds)=57 > 12/08/07 09:46:47 INFO mapred.JobClient: Giraph Stats > 12/08/07 09:46:47 INFO mapred.JobClient: Aggregate edges=10005 > 12/08/07 09:46:47 INFO mapred.JobClient: Superstep=10 > > > > > 12/08/07 09:46:47 INFO mapred.JobClient: Last checkpointed superstep=8 > 12/08/07 09:46:47 INFO mapred.JobClient: Current workers=2 > 12/08/07 09:46:47 INFO mapred.JobClient: Current master task partition=0 > > > > > 12/08/07 09:46:47 INFO mapred.JobClient: Sent messages=0 > 12/08/07 09:46:47 INFO mapred.JobClient: Aggregate finished vertices=5000 > 12/08/07 09:46:47 INFO mapred.JobClient: Aggregate vertices=5000 > 12/08/07 09:46:47 INFO mapred.JobClient: File Output Format Counters > > > > > 12/08/07 09:46:47 INFO mapred.JobClient: Bytes Written=0 > 12/08/07 09:46:47 INFO mapred.JobClient: FileSystemCounters > 12/08/07 09:46:47 INFO mapred.JobClient: FILE_BYTES_READ=236 > 12/08/07 09:46:47 INFO mapred.JobClient: HDFS_BYTES_READ=146766 > > > > > 12/08/07 09:46:47 INFO mapred.JobClient: FILE_BYTES_WRITTEN=66618 > 12/08/07 09:46:47 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=655800 > 12/08/07 09:46:47 INFO mapred.JobClient: File Input Format Counters > > > > 12/08/07 09:46:47 INFO mapred.JobClient: Bytes Read=0 > > 12/08/07 09:46:47 INFO mapred.JobClient: Map-Reduce Framework > 12/08/07 09:46:47 INFO mapred.JobClient: Map input records=3 > 12/08/07 09:46:47 INFO mapred.JobClient: Physical memory (bytes) > snapshot=364912640 > > > > > 12/08/07 09:46:47 INFO mapred.JobClient: Spilled Records=0 > 12/08/07 09:46:47 INFO mapred.JobClient: CPU time spent (ms)=3840 > 12/08/07 09:46:47 INFO mapred.JobClient: Total committed heap usage > (bytes)=602996736 > > > > > 12/08/07 09:46:47 INFO mapred.JobClient: Virtual memory (bytes) > snapshot=9814392832 > 12/08/07 09:46:47 INFO mapred.JobClient: Map output records=0 > 12/08/07 09:46:47 INFO mapred.JobClient: 
SPLIT_RAW_BYTES=132
>
> From the Web interface,
>
> Mapper 000 (went to master), status: MASTER_ZOOKEEPER_ONLY - 2 finished out
> of 2 on superstep 9
> Mapper 001 (went to slave), status: finishSuperstep: (all workers done)
> WORKER_ONLY - Attempt=0, Superstep=10
> Mapper 002 (went to master), status: finishSuperstep: (all workers done)
> WORKER_ONLY - Attempt=0, Superstep=10
>
> Here is the last 8KB syslog of Mapper 2 (Mapper 3 was similar),
>
> 2012-08-07 09:46:34,644 WARN org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/6/_superstepFinished, type=NodeDeleted, state=SyncConnected)
> 2012-08-07 09:46:34,646 INFO org.apache.giraph.graph.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=8 with /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_workerHealthyDir/slave_1 and workerInfo= Worker(hostname=slave <http://pointblank.corpdom.com>, MRpartition=1, port=30001)
> 2012-08-07 09:46:34,650 INFO org.apache.giraph.graph.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
> 2012-08-07 09:46:34,651 INFO org.apache.giraph.graph.BspServiceWorker: startSuperstep: Ready for computation on superstep 8 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_partitionAssignments
> 2012-08-07 09:46:34,651 INFO org.apache.giraph.graph.BspServiceWorker: getAggregatorValues: no aggregators in /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/7/_mergedAggregatorDir on superstep 8
> 2012-08-07 09:46:34,652 INFO org.apache.giraph.graph.BspServiceWorker: exchangeVertexPartitions: Nothing to exchange, exiting early
> 2012-08-07 09:46:34,674 INFO org.apache.giraph.graph.BspServiceWorker: storeCheckpoint: Finished metadata
(_bsp/_checkpoints/job_201208070927_0003/8.slave_1.metadata) and vertices > (_bsp/_checkpoints/job_201208070927_0003/8.slave_1.vertices). > 2012-08-07 09:46:34,679 INFO org.apache.giraph.comm.BasicRPCCommunications: > flush: starting for superstep 8 totalMem = 191.6875M, maxMem = 191.6875M, > freeMem = 164.97296M > 2012-08-07 09:46:34,679 INFO org.apache.giraph.comm.BasicRPCCommunications: > flush: ended for superstep 8 totalMem = 191.6875M, maxMem = 191.6875M, > freeMem = 164.97285M > 2012-08-07 09:46:34,679 INFO org.apache.giraph.graph.BspServiceWorker: > finishSuperstep: Superstep 8 totalMem = 191.6875M, maxMem = 191.6875M, > freeMem = 164.97285M > 2012-08-07 09:46:34,689 INFO org.apache.giraph.graph.BspService: process: > superstepFinished signaled > 2012-08-07 09:46:34,690 INFO org.apache.giraph.graph.BspServiceWorker: > finishSuperstep: Completed superstep 8 with global stats > (vtx=5000,finVtx=5000,edges=10005,msgCount=20) > 2012-08-07 09:46:34,690 INFO org.apache.giraph.comm.BasicRPCCommunications: > prepareSuperstep: Superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, > freeMem = 164.97285M > 2012-08-07 09:46:34,696 INFO org.apache.giraph.graph.BspServiceWorker: > registerHealth: Created my health node for attempt=0, superstep=9 with > /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/9/_workerHealthyDir/slave_1 > and workerInfo= Worker(hostname=slave <http://pointblank.corpdom.com>, > MRpartition=1, port=30001) > 2012-08-07 09:46:34,701 WARN org.apache.giraph.graph.BspService: process: > Unknown and unprocessed event > (path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/7/_partitionAssignments, > type=NodeDeleted, state=SyncConnected) > 2012-08-07 09:46:34,705 WARN org.apache.giraph.graph.BspService: process: > Unknown and unprocessed event > (path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/7/_superstepFinished, > type=NodeDeleted, state=SyncConnected) > 2012-08-07 
09:46:34,714 INFO org.apache.giraph.graph.BspService: process: > partitionAssignmentsReadyChanged (partitions are assigned) > 2012-08-07 09:46:34,715 INFO org.apache.giraph.graph.BspServiceWorker: > startSuperstep: Ready for computation on superstep 9 since worker selection > and vertex range assignments are done in > /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/9/_partitionAssignments > 2012-08-07 09:46:34,716 INFO org.apache.giraph.graph.BspServiceWorker: > getAggregatorValues: no aggregators in > /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_mergedAggregatorDir > on superstep 9 > 2012-08-07 09:46:34,716 INFO org.apache.giraph.graph.BspServiceWorker: > exchangeVertexPartitions: Nothing to exchange, exiting early > 2012-08-07 09:46:34,716 INFO org.apache.giraph.comm.BasicRPCCommunications: > flush: starting for superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, > freeMem = 164.47278M > 2012-08-07 09:46:34,716 INFO org.apache.giraph.comm.BasicRPCCommunications: > flush: ended for superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, > freeMem = 164.47269M > 2012-08-07 09:46:34,716 INFO org.apache.giraph.graph.BspServiceWorker: > finishSuperstep: Superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, > freeMem = 164.47269M > 2012-08-07 09:46:34,721 INFO org.apache.giraph.graph.BspService: process: > superstepFinished signaled > 2012-08-07 09:46:34,722 INFO org.apache.giraph.graph.BspServiceWorker: > finishSuperstep: Completed superstep 9 with global stats > (vtx=5000,finVtx=5000,edges=10005,msgCount=0) > 2012-08-07 09:46:34,722 INFO org.apache.giraph.graph.GraphMapper: map: BSP > application done (global vertices marked done) > 2012-08-07 09:46:34,722 INFO org.apache.giraph.graph.GraphMapper: cleanup: > Starting for WORKER_ONLY > 2012-08-07 09:46:34,722 WARN org.apache.giraph.graph.BspService: process: > Unknown and unprocessed event > 
(path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_partitionAssignments, > type=NodeDeleted, state=SyncConnected) > 2012-08-07 09:46:34,726 WARN org.apache.giraph.graph.BspService: process: > Unknown and unprocessed event > (path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_superstepFinished, > type=NodeDeleted, state=SyncConnected) > 2012-08-07 09:46:34,729 INFO org.apache.giraph.graph.BspServiceWorker: > cleanup: Notifying master its okay to cleanup with > /_hadoopBsp/job_201208070927_0003/_cleanedUpDir/1_worker > 2012-08-07 09:46:34,731 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x138fe1c4699004a closed > 2012-08-07 09:46:34,731 INFO org.apache.giraph.comm.BasicRPCCommunications: > close: shutting down RPC server > 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: Stopping server on > 30011 > 2012-08-07 09:46:34,731 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 2 on 30011: exiting > 2012-08-07 09:46:34,731 INFO > org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down > 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 0 on 30011: exiting > 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server Responder > 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: Stopping IPC > Server listener on 30011 > 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 1 on 30011: exiting > 2012-08-07 09:46:34,731 INFO org.apache.giraph.zk.ZooKeeperManager: > createZooKeeperClosedStamp: Creating my filestamp > _bsp/_defaultZkManagerDir/job_201208070927_0003/_task/1.COMPUTATION_DONE > 2012-08-07 09:46:34,736 INFO org.apache.hadoop.mapred.Task: > Task:attempt_201208070927_0003_m_000001_0 is done. 
And is in the process of commiting
> 2012-08-07 09:46:35,837 INFO org.apache.hadoop.mapred.Task: Task attempt_201208070927_0003_m_000001_0 is allowed to commit now
> 2012-08-07 09:46:35,848 INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_201208070927_0003_m_000001_0' to hdfs:/user/vpatel/giraph_out/two
> 2012-08-07 09:46:37,776 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201208070927_0003_m_000001_0' done.
> 2012-08-07 09:46:37,779 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-08-07 09:46:37,812 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
> 2012-08-07 09:46:37,813 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName vpatel for UID 10020 from the native implementation
>
> I have attached the network file to this email. It has 10,000 lines
> corresponding to the 10,000 nodes in the adjacency list format (tab
> separated).
>
> Here is jps from master:
> 5178 TaskTracker
> 4662 DataNode
> 4491 NameNode
> 34115 RunJar
> 4865 SecondaryNameNode
> 8385 Jps
> 29410 QuorumPeerMain
> 4991 JobTracker
>
> jps from slave
> 48621 TaskTracker
> 48464 DataNode
> 51391 Jps
>
> Thank you again for your help,
>
> Vishal
>
> On Tue, Aug 7, 2012 at 12:07 AM, Sebastian Schelter <s...@apache.org> wrote:
>
>> Can you check what the mappers were doing via the web interface of
>> Hadoop? Can you run 4 mappers at once?
>>
>> On 07.08.2012 01:46, Vishal Patel wrote:
>> > I'm seeing a strange behavior that I can't explain.
>> >
>> > hadoop jar giraph-0.1-jar-with-dependencies.jar
>> > org.apache.giraph.GiraphRunner
>> > org.apache.giraph.examples.ConnectedComponentsVertex --inputFormat
>> > org.apache.giraph.examples.IntIntNullIntTextInputFormat --inputPath
>> > /user/vpatel/graph_in/elist.txt --outputFormat
>> > org.apache.giraph.examples.VertexWithComponentTextOutputFormat --outputPath
>> > hdfs:///user/vpatel/giraph_out/1 --workers 4 --combiner
>> > org.apache.giraph.examples.MinimumIntCombiner
>> > Warning: $HADOOP_HOME is deprecated.
>> >
>> > 12/08/06 16:16:40 INFO mapred.JobClient: Running job: job_201208031459_0591
>> > 12/08/06 16:16:41 INFO mapred.JobClient: map 0% reduce 0%
>> > 12/08/06 16:16:59 INFO mapred.JobClient: map 20% reduce 0%
>> > 12/08/06 16:17:05 INFO mapred.JobClient: map 40% reduce 0%
>> > 12/08/06 16:17:08 INFO mapred.JobClient: map 100% reduce 0%
>> > 12/08/06 16:17:11 INFO mapred.JobClient: map 80% reduce 0%
>> > 12/08/06 16:17:16 INFO mapred.JobClient: Task Id : attempt_201208031459_0591_m_000000_0, Status : FAILED
>> > *java.lang.Throwable: Child Error
>> >     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
>> > Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>> >     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>> > *
>> >
>> > I either get the above error, which I can avoid if I decrease my number of
>> > workers (based on a previous post on the mailing list).
>> >
>> > However, when I specify fewer workers (say 2), or sometimes when I simply
>> > don't get the above error, the result is missing for one part in HDFS.
>> > I.e. when I did workers=2, I got two parts. One of them had 5,000 out of
>> > the 10k nodes and the other part was blank. This happened with workers=4, 5,
>> > etc. as well.
>> >
>> > There are no errors in the log.
>> >
>> > Just to be clear, the input format is adjacency list,
>> > i.e. if a -> b, a -> c and b -> d, then
>> > a b c
>> > b a d
>> > c a
>> > d b
>> >
>> > since the graph is undirected. Any idea what could be wrong?
>> >
>> > Here is the log when I do workers=1
>> >
>> > Finally loaded a total of *(v=10000, e=19996)*
>> > 2012-08-06 16:39:13,902 INFO org.apache.giraph.graph.BspService: process: inputSplitsAllDoneChanged (all vertices sent from input splits)
>> > 2012-08-06 16:39:13,904 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: starting for superstep -1 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.6044M
>> > 2012-08-06 16:39:13,906 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: ended for superstep -1 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.60431M
>> > 2012-08-06 16:39:13,906 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Superstep -1 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.60431M
>> > 2012-08-06 16:39:13,922 INFO org.apache.giraph.graph.BspService: process: superstepFinished signaled
>> > 2012-08-06 16:39:13,924 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Completed superstep -1 with global stats (vtx=0,finVtx=0,edges=0,msgCount=0)
>> > 2012-08-06 16:39:13,924 INFO org.apache.giraph.graph.GraphMapper: cleanup: Starting for WORKER_ONLY
>> > 2012-08-06 16:39:13,925 INFO org.apache.giraph.graph.BspServiceWorker: processEvent: Job state changed, checking to see if it needs to restart
>> > 2012-08-06 16:39:13,926 INFO org.apache.giraph.graph.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201208031459_0621/_masterJobState)
>> > 2012-08-06 16:39:13,929 INFO org.apache.giraph.graph.BspServiceWorker: cleanup: Notifying master its okay to cleanup with /_hadoopBsp/job_201208031459_0621/_cleanedUpDir/1_worker
>> > 2012-08-06 16:39:13,930 INFO
org.apache.zookeeper.ZooKeeper: Session: >> > 0x138fe1c4699003d closed >> > 2012-08-06 16:39:13,930 INFO >> > org.apache.giraph.comm.BasicRPCCommunications: close: shutting down >> > RPC server >> > 2012-08-06 16:39:13,930 INFO org.apache.zookeeper.ClientCnxn: >> > EventThread shut down >> > 2012-08-06 16:39:13,930 INFO org.apache.hadoop.ipc.Server: Stopping >> > server on 30003 >> > 2012-08-06 16:39:13,930 INFO org.apache.hadoop.ipc.Server: IPC Server >> > handler 0 on 30003: exiting >> > 2012-08-06 16:39:13,930 INFO org.apache.hadoop.ipc.Server: Stopping >> > IPC Server listener on 30003 >> > 2012-08-06 16:39:13,930 INFO >> > org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down >> > 2012-08-06 16:39:13,931 INFO org.apache.hadoop.ipc.Server: Stopping >> > IPC Server Responder >> > 2012-08-06 16:39:13,931 INFO org.apache.hadoop.ipc.Server: IPC Server >> > handler 1 on 30003: exiting >> > 2012-08-06 16:39:13,931 INFO org.apache.giraph.zk.ZooKeeperManager: >> > createZooKeeperClosedStamp: Creating my filestamp >> > _bsp/_defaultZkManagerDir/job_201208031459_0621/_task/1.COMPUTATION_DONE >> > 2012-08-06 16:39:13,934 INFO org.apache.hadoop.mapred.Task: >> > Task:attempt_201208031459_0621_m_000001_0 is done. And is in the >> > process of commiting >> > 2012-08-06 16:39:15,026 INFO org.apache.hadoop.mapred.Task: Task >> > attempt_201208031459_0621_m_000001_0 is allowed to commit now >> > 2012-08-06 16:39:15,036 INFO >> > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved >> > output of task 'attempt_201208031459_0621_m_000001_0' to >> > hdfs:/user/vpatel/giraph_out/one >> > 2012-08-06 16:39:16,068 INFO org.apache.hadoop.mapred.Task: Task >> > 'attempt_201208031459_0621_m_000001_0' done. 
>> > 2012-08-06 16:39:16,087 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> > 2012-08-06 16:39:16,117 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
>> > 2012-08-06 16:39:16,118 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName vpatel for UID 10020 from the native implementation
>> >
>> > On Mon, Aug 6, 2012 at 3:05 PM, Sebastian Schelter <s...@apache.org> wrote:
>> >
>> >> The job expects the input data in adjacency list format, each line
>> >> should look like:
>> >>
>> >> vertex neighbor1 neighbor2 ....
>> >>
>> >> --sebastian
>> >>
>> >>
>> >> On 07.08.2012 00:02, Vishal Patel wrote:
>> >>> Thanks Sebastian, it runs fine now. However, the output comes back as
>> >>>
>> >>> 0 0
>> >>> 1 1
>> >>> 2 2
>> >>> 3 3
>> >>> 4 4
>> >>> 5 5
>> >>> 6 6
>> >>> ..
>> >>>
>> >>> I have an unsorted edge file with just int values.
>> >>> http://www.ics.uci.edu/~vishalrp/public/testg.txt
>> >>>
>> >>> My test graph (head below) has 10,000 nodes (from 0 to 9999) and 9998
>> >>> edges. There are 4 connected components in the graph.
>> >>>
>> >>> 0 5800
>> >>> 0 5981
>> >>> 1 1239
>> >>> 1 2989
>> >>> 1 3961
>> >>> 2 5417
>> >>> 2 7350
>> >>>
>> >>> What am I doing wrong? Also, in general does the graph have to have int
>> >>> values for nodes? Or can I have strings?
>> >>>
>> >>> Appreciate your help!
>> >>>
>> >>> Vishal
>> >>>
>> >>>
>> >>> On Mon, Aug 6, 2012 at 2:22 PM, Sebastian Schelter <s...@apache.org> wrote:
>> >>>
>> >>>> You cannot run the vertex class directly. Instead you can use
>> >>>> GiraphRunner, e.g.
>> >>>>
>> >>>> hadoop jar giraph-jar-with-dependencies.jar
>> >>>> org.apache.giraph.GiraphRunner
>> >>>> org.apache.giraph.examples.ConnectedComponentsVertex --inputFormat
>> >>>> org.apache.giraph.examples.IntIntNullIntTextInputFormat --inputPath
>> >>>> hdfs:///path/to/input --outputFormat
>> >>>> org.apache.giraph.examples.VertexWithComponentTextOutputFormat
>> >>>> --outputPath hdfs:///path/to/output --workers numWorkers --combiner
>> >>>> org.apache.giraph.examples.MinimumIntCombiner
>> >>>>
>> >>>> --sebastian
>> >>>>
>> >>>>
>> >>>> 2012/8/6 Vishal Patel <write2vis...@gmail.com>:
>> >>>>> Hi, I am trying to run the connected-components example. I have giraph
>> >>>>> installed, all the tests pass on a 3 node cluster running hadoop-1.0.3/
>> >>>>>
>> >>>>> The main method is missing in the ConnectedComponentsVertex class
>> >>>>>
>> >>>>> cd target/classes
>> >>>>> hadoop jar ../giraph-0.1-jar-with-dependencies.jar
>> >>>>> org.apache.giraph.examples.ConnectedComponentsVertex
>> >>>>>
>> >>>>> Exception in thread "main" java.lang.NoSuchMethodException:
>> >>>>> org.apache.giraph.examples.ConnectedComponentsVertex.main([Ljava.lang.String;)
>> >>>>>     at java.lang.Class.getMethod(Class.java:1622)
>> >>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:150)
>> >>>>>
>> >>>>> Can someone please help me with running this example?
>> >>>>>
>> >>>>> Vishal
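
A note for readers of the archive: the earliest messages above turn on the difference between an edge file ("0 5800", "0 5981", ...) and the adjacency list format Sebastian describes ("vertex neighbor1 neighbor2 ...."). The conversion can be sketched in a few lines of Python. This is only an illustration and not part of Giraph: the function name edges_to_adjacency is made up here, and the sketch assumes whitespace-separated integer IDs, undirected edges, and tab-separated output as mentioned in the thread.

```python
from collections import defaultdict


def edges_to_adjacency(edge_lines):
    """Turn 'u v' edge lines into 'vertex<TAB>neighbor1<TAB>neighbor2...' lines.

    Hypothetical helper, not a Giraph API. Each edge is treated as
    undirected, so u is listed as a neighbor of v and v as a neighbor of u.
    """
    neighbors = defaultdict(set)
    for line in edge_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        u, v = int(parts[0]), int(parts[1])
        neighbors[u].add(v)
        neighbors[v].add(u)
    # One output row per vertex, vertices and neighbors in ascending order.
    return [
        "\t".join(str(x) for x in [vertex] + sorted(neighbors[vertex]))
        for vertex in sorted(neighbors)
    ]


if __name__ == "__main__":
    # Sample taken from the head of the edge file quoted in the thread.
    sample = ["0 5800", "0 5981", "1 1239", "1 2989", "1 3961", "2 5417", "2 7350"]
    for row in edges_to_adjacency(sample):
        print(row)
```

One caveat with this approach: a vertex that never appears in any edge produces no row, so isolated vertices would have to be added separately if the job needs them.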