Steven,

Could you please try your application again with http://people.apache.org/~edwardyoon/dist/test/ and let me know whether it works as you expected?

On Wed, Apr 24, 2013 at 4:53 PM, Edward J. Yoon <[email protected]> wrote:
> Thanks for your report. It could be a bug. I'll have a look at it now.
>
> On Wed, Apr 24, 2013 at 4:48 PM, Steven van Beelen <[email protected]> wrote:
>> I'm running version 0.6.1.
>> Looking at the results I found through testing,
>>
>> public void aggregateVertex(M lastValue, Vertex<V, E, M> v)
>>
>> doesn't seem to be the problem. Both 'aggregate(v, v.getValue())' and
>> 'aggregate(v, lastValue, v.getValue())'
>> are called correctly and work on the same values.
>>
>> However, when finalizing through 'finalizeAggregation()' in the
>> 'public void doMasterAggregation(MapWritable updatedCnt)' method,
>> the value aggregated by 'aggregate(v, lastValue, v.getValue())'
>> is lost. That is what happens for me.
>>
>> Could it be that I'm implementing the aggregate methods incorrectly?
>>
>> In the end, however, I cannot find a direct bug in TRUNK[1], although
>> it is not clear to me which part of the code was changed through
>> the ticket on JIRA.
>>
>>
>> On Wed, Apr 24, 2013 at 2:41 AM, Edward J. Yoon <[email protected]> wrote:
>>
>>> I found the ticket on JIRA -
>>> https://issues.apache.org/jira/browse/HAMA-659
>>>
>>> And it seems to be fixed already.
>>>
>>> Which version of Hama are you using? And can you find a bug in TRUNK[1]?
>>>
>>> 1.
>>> http://svn.apache.org/repos/asf/hama/trunk/graph/src/main/java/org/apache/hama/graph/AggregationRunner.java
>>>
>>> On Tue, Apr 23, 2013 at 9:41 PM, Steven van Beelen <[email protected]> wrote:
>>> > Could anyone tell me if I'm correct concerning the possible problem I
>>> > posted and replied on in the previous two emails?
>>> >
>>> >
>>> > On Wed, Apr 17, 2013 at 5:08 PM, Steven van Beelen <[email protected]> wrote:
>>> >
>>> >> Additionally, I found this in the mail archives:
>>> >>
>>> >> http://mail-archives.apache.org/mod_mbox/hama-user/201210.mbox/%3CCAJ-=ys=W8F5W4aduV+=+yfsvh41xsa22-wnqqrkapadzd+q...@mail.gmail.com%3E
>>> >> This covers my point exactly. Is this still considered a bug,
>>> >> calling two different aggregate functions in a row?
>>> >>
>>> >>
>>> >> On Wed, Apr 17, 2013 at 2:35 PM, Steven van Beelen <[email protected]> wrote:
>>> >>
>>> >>> Hi Thomas,
>>> >>>
>>> >>> Then I guess I did not explain myself clearly.
>>> >>> What you describe is indeed how I expect the AverageAggregator to work,
>>> >>> but if I use the AverageAggregator in my own PageRank implementation it
>>> >>> does not return the average of all absolute differences but just the
>>> >>> average of the sum of all values.
>>> >>>
>>> >>> The (very) small example graph I use has only five vertices, where the
>>> >>> sum of all vertex values is always 1.0.
>>> >>> When I use the AverageAggregator it will always return 0.2 when calling
>>> >>> the getLastAggregatedValue method.
>>> >>> It shouldn't do that, right?
>>> >>>
>>> >>>
>>> >>> On Wed, Apr 17, 2013 at 1:18 PM, Thomas Jungblut <[email protected]> wrote:
>>> >>>
>>> >>>> Hi Steven,
>>> >>>>
>>> >>>> the AverageAggregator is used to determine the average of all absolute
>>> >>>> differences between old pagerank and new pagerank for every vertex.
>>> >>>> This behavior is documented in the javadoc of the given classes and
>>> >>>> suffices to track whether the pagerank values have converged yet.
>>> >>>>
>>> >>>> What you describe is a perfectly valid way to track the pagerank
>>> >>>> difference throughout all supersteps. But this is not how (imho) the
>>> >>>> AverageAggregator should behave, so you have to write your own.
>>> >>>>
>>> >>>>
>>> >>>> 2013/4/17 Steven van Beelen <[email protected]>
>>> >>>>
>>> >>>> > The values in my case are the DoubleWritable values each vertex has
>>> >>>> > and the aggregators aggregate on.
>>> >>>> > My tests showed that, when the aggregator was set to
>>> >>>> > AverageAggregator, the average of all the vertex values from the
>>> >>>> > past compute step was returned.
>>> >>>> > Actually, AverageAggregator should return the average difference of
>>> >>>> > all the old-new value pairs of every vertex instead of the mean.
>>> >>>> > The average difference is then used to check whether convergence is
>>> >>>> > reached, which is relevant for all tasks of course.
>>> >>>> >
>>> >>>> > Hence, the convergence point, for which the Aggregator is used, will
>>> >>>> > not be reached.
>>> >>>> > This means the algorithm will just run the maximum number of
>>> >>>> > iterations set (30 iterations in the PageRank example) in every case.
>>> >>>> > I experienced the same with my own PageRank implementation.
>>> >>>> >
>>> >>>> > I think it has something to do with the finalizeAggregation step taken.
>>> >>>> > Next to that, both the 'aggregate(VERTEX vertex, M value)' and
>>> >>>> > 'aggregate(VERTEX vertex, M oldValue, M newValue)' methods are called
>>> >>>> > every time, where one would think only the second (with old/new
>>> >>>> > values) would suffice.
>>> >>>> > Because of this, the global variable 'absoluteDifference' in the
>>> >>>> > 'AbsDiffAggregator' class is overwritten/overruled by the first
>>> >>>> > aggregate.
>>> >>>> > Additionally, if I make my own Aggregation class in the same
>>> >>>> > fashion as AbsDiffAggregator and AverageAggregator, but leave out the
>>> >>>> > 'aggregate(VERTEX vertex, M value)' method, my output turns out to be
>>> >>>> > 0.0000 every time.
>>> >>>> >
>>> >>>> > I hope I made myself clear.
>>> >>>> > Regards
>>> >>>> >
>>> >>>> >
>>> >>>> > On Wed, Apr 17, 2013 at 11:57 AM, Edward J. Yoon <[email protected]> wrote:
>>> >>>> >
>>> >>>> > > Thanks for your report.
>>> >>>> > >
>>> >>>> > > What's the meaning of 'all the values'? Please give me more details
>>> >>>> > > about your problem.
>>> >>>> > >
>>> >>>> > > I didn't look at the 'dangling links & aggregators' part of the
>>> >>>> > > PageRank example closely, but I think there's no bug. Aggregators
>>> >>>> > > are just used for global communication. For example, finding the
>>> >>>> > > max value[1] can be done in only one iteration using
>>> >>>> > > MaxValueAggregator.
>>> >>>> > >
>>> >>>> > > 1. http://cdn.dejanseo.com.au/wp-content/uploads/2011/06/supersteps.png
>>> >>>> > >
>>> >>>> > > On Wed, Apr 17, 2013 at 6:27 PM, Steven van Beelen <[email protected]>
>>> >>>> > > wrote:
>>> >>>> > > > Hello,
>>> >>>> > > >
>>> >>>> > > > I'm creating my own pagerank in Hama for testing and I think I
>>> >>>> > > > found a problem with the AverageAggregator. I'm not sure if it is
>>> >>>> > > > me or the AverageAggregator class in general, but I believe it
>>> >>>> > > > just returns the mean of all the values instead of the average
>>> >>>> > > > difference between the old and new value as intended.
>>> >>>> > > >
>>> >>>> > > > For testing, I created my own AbsDiffAggregator and
>>> >>>> > > > AverageAggregator classes, using FloatWritable instead of
>>> >>>> > > > DoubleWritable. The same problem still occurred: I got a mean of
>>> >>>> > > > all the values in the graph instead of an average difference.
>>> >>>> > > >
>>> >>>> > > > Could someone tell me if I'm doing something wrong or what I
>>> >>>> > > > should provide to better explain my problem?
>>> >>>> > > >
>>> >>>> > > > Regards,
>>> >>>> > > > Steven van Beelen, Vrije Universiteit of Amsterdam
>>> >>>> > >
>>> >>>> > >
>>> >>>> > > --
>>> >>>> > > Best Regards, Edward J. Yoon
>>> >>>> > > @eddieyoon
>>> >>>> > >
>>> >>>> >
>>> >>>>
>>> >>>
>>> >>
>>>
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon

--
Best Regards, Edward J. Yoon
@eddieyoon
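
The symptom reported in this thread (five vertices whose values sum to 1.0, and an aggregated value that is always 0.2) can be reproduced with a small stand-alone Java sketch. It is not the Hama code: the class AggregatorSketch and both of its methods are hypothetical, and the failure mode it models (a single-value callback and an old/new callback both adding into the same accumulator for every vertex) is only an assumption based on the description in the thread.

```java
// Hypothetical, simplified sketch; these are NOT the Hama classes.
final class AggregatorSketch {

    /** What the thread expects: the mean of |old - new| over all vertices. */
    static double averageAbsDiff(double[] oldValues, double[] newValues) {
        double sum = 0.0;
        for (int i = 0; i < newValues.length; i++) {
            sum += Math.abs(oldValues[i] - newValues[i]);
        }
        return sum / newValues.length;
    }

    /**
     * Assumed failure mode: for every vertex, both a single-value callback
     * and an old/new callback add into the same accumulator.
     */
    static double averageWithBothCallbacks(double[] oldValues, double[] newValues) {
        double sum = 0.0;
        for (int i = 0; i < newValues.length; i++) {
            sum += newValues[i];                          // like aggregate(v, value)
            sum += Math.abs(oldValues[i] - newValues[i]); // like aggregate(v, old, new)
        }
        return sum / newValues.length;
    }

    public static void main(String[] args) {
        // Five vertices whose values sum to 1.0 and no longer change.
        double[] oldValues = {0.1, 0.3, 0.2, 0.25, 0.15};
        double[] newValues = oldValues.clone();

        // Expected aggregator: no change between supersteps, so 0.0.
        System.out.println(averageAbsDiff(oldValues, newValues));
        // Assumed failure mode: roughly 1.0 / 5 = 0.2, regardless of convergence.
        System.out.println(averageWithBothCallbacks(oldValues, newValues));
    }
}
```

Under this assumption the aggregated value never drops below the mean of the vertex values, so a convergence check against a small threshold would never succeed and the job would run for the maximum number of iterations, which matches what was observed with the PageRank example.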
