Re: duplicate edges created with TextVertexInputFormat

2014-01-30 Thread Eric Kimbrel
I don’t see how I could possibly pass in the input file twice. I am simply using GiraphRunner and specifying -vif and -vip; the path only contains the file once. I do agree, though, that it is being read in as two identical input splits. In fact you see this line hdfs://arcus1.silverdale.dev/t

duplicate edges created with TextVertexInputFormat

2014-01-29 Thread Eric Kimbrel
I am reading in an adjacency list using an input format which extends TextVertexInputFormat. My code doesn’t do anything to address input splits, but leaves that to the underlying Giraph implementation. However, it appears that as the data is being read, two identical input splits are created and

giraph on YARN logging

2014-01-03 Thread Eric Kimbrel
I have used Giraph on mr-v1 successfully over the last year, but have recently updated to using YARN and I am having a lot of trouble. My biggest problem is that I can’t figure out how to actually look at the logs from my job and figure out what has gone wrong and why it is failing. I can view

yarn container accounting errors.

2014-01-03 Thread Eric Kimbrel
I am running on hadoop 2.2.0-cdh5.0.0-beta-1 with pure yarn mode. I have noticed two issues in org.apache.giraph.yarn.GiraphYarnClient (the Giraph code base is 1.1.0-SNAPSHOT, downloaded on Jan 2, 2014), looking at the checkPerNodeResourcesAvailable method. The first issue is that the number of con

Re: Broadcast of large aggregated value is slow.

2013-05-16 Thread Eric Kimbrel
One of the attached logs is worker 13. During this time period it is waiting for an aggregator request so that it can start the super step.

Re: Broadcast of large aggregated value is slow.

2013-05-16 Thread Eric Kimbrel
From the attached logs in the original post, you can see that both workers use about 4 seconds of compute time on super step 4, but they complete super step 4 about 10 minutes apart.