I found the ticket on JIRA - https://issues.apache.org/jira/browse/HAMA-659
And it seems to be fixed already. Which version of Hama are you using, and can you still find the bug in TRUNK [1]?

1. http://svn.apache.org/repos/asf/hama/trunk/graph/src/main/java/org/apache/hama/graph/AggregationRunner.java

On Tue, Apr 23, 2013 at 9:41 PM, Steven van Beelen <[email protected]> wrote:
> Could anyone tell me if I'm correct concerning the possible problem I
> posted and replied to in the previous two emails?
>
>
> On Wed, Apr 17, 2013 at 5:08 PM, Steven van Beelen <[email protected]> wrote:
>
>> Additionally, I found this in the mail archives:
>>
>> http://mail-archives.apache.org/mod_mbox/hama-user/201210.mbox/%3CCAJ-=ys=W8F5W4aduV+=+yfsvh41xsa22-wnqqrkapadzd+q...@mail.gmail.com%3E
>> This covers my point exactly. Is calling two different aggregate
>> functions in a row still considered a bug?
>>
>>
>> On Wed, Apr 17, 2013 at 2:35 PM, Steven van Beelen <[email protected]> wrote:
>>
>>> Hi Thomas,
>>>
>>> Then I guess I did not explain myself clearly.
>>> What you describe is indeed how I expect the AverageAggregator to work,
>>> but if I use the AverageAggregator in my own PageRank implementation it
>>> does not return the average of all absolute differences but just the
>>> average of the sum of all values.
>>>
>>> The (very) small example graph I use has only five vertices, where the
>>> sum of all vertex values is always 1.0.
>>> When I use the AverageAggregator it will always return 0.2 when calling
>>> the getLastAggregatedValue method.
>>> It shouldn't do that, right?
>>>
>>>
>>> On Wed, Apr 17, 2013 at 1:18 PM, Thomas Jungblut <[email protected]> wrote:
>>>
>>>> Hi Steven,
>>>>
>>>> the AverageAggregator is used to determine the average of all absolute
>>>> differences between old pagerank and new pagerank for every vertex.
>>>> This is the behavior documented in the javadoc of the given classes,
>>>> and it suffices to track whether pagerank values have converged yet or
>>>> not.
>>>>
>>>> What you describe is a perfectly valid way to track the pagerank difference
>>>> throughout all supersteps. But this is not how (imho) the AverageAggregator
>>>> should behave, so you have to write your own.
>>>>
>>>>
>>>> 2013/4/17 Steven van Beelen <[email protected]>
>>>>
>>>> > The values in my case are the DoubleWritable values each vertex has and
>>>> > the aggregators aggregate on.
>>>> > My tests showed that, when the aggregator was set to AverageAggregator, the
>>>> > average of all the vertex values from the past compute step was returned.
>>>> > Actually, AverageAggregator should return the average difference of all the
>>>> > old-new value pairs of every vertex instead of the mean.
>>>> > The average difference is then used to check whether convergence is
>>>> > reached, which is relevant for all tasks of course.
>>>> >
>>>> > Hence, the convergence point, for which the Aggregator is used, will not be
>>>> > reached.
>>>> > As a result, the algorithm will just run the maximum number of iterations
>>>> > set (30 iterations on the PageRank example) in every case.
>>>> > I experienced the same with my own PageRank implementation.
>>>> >
>>>> > I think it has something to do with the finalizeAggregation step taken.
>>>> > Next to that, both the 'aggregate(VERTEX vertex, M value)' and
>>>> > 'aggregate(VERTEX vertex, M oldValue, M newValue)' methods are called every
>>>> > time, where one would think only the second (with old/new values) would
>>>> > suffice.
>>>> > Because of this, the global variable 'absoluteDifference' in the
>>>> > 'AbsDiffAggregator' class is overwritten/overruled by the first aggregate.
>>>> > Additionally, when I made my own Aggregation class in the same
>>>> > fashion as AbsDiffAggregator and AverageAggregator, but left out the
>>>> > 'aggregate(VERTEX vertex, M value)', my output turned out to be 0.0000
>>>> > every time.
>>>> >
>>>> > I hope I made myself clear.
>>>> > Regards
>>>> >
>>>> >
>>>> > On Wed, Apr 17, 2013 at 11:57 AM, Edward J. Yoon <[email protected]> wrote:
>>>> >
>>>> > > Thanks for your report.
>>>> > >
>>>> > > What's the meaning of 'all the values'? Please give me more details
>>>> > > about your problem.
>>>> > >
>>>> > > I didn't look at the 'dangling links & aggregators' part of the PageRank
>>>> > > example closely, but I think there's no bug. Aggregators are just used
>>>> > > for global communication. For example, finding the max value[1] can be
>>>> > > done in only one iteration using MaxValueAggregator.
>>>> > >
>>>> > > 1. http://cdn.dejanseo.com.au/wp-content/uploads/2011/06/supersteps.png
>>>> > >
>>>> > > On Wed, Apr 17, 2013 at 6:27 PM, Steven van Beelen <[email protected]> wrote:
>>>> > > > Hello,
>>>> > > >
>>>> > > > I'm creating my own pagerank in Hama for testing and I think I found a
>>>> > > > problem with the AverageAggregator. I'm not sure if it is me or the
>>>> > > > AverageAggregator class in general, but I believe it just returns the
>>>> > > > mean of all the values instead of the average difference between the
>>>> > > > old and new value as intended.
>>>> > > >
>>>> > > > For testing, I created my own AbsDiffAggregator and AverageAggregator
>>>> > > > classes, using FloatWritable instead of DoubleWritable. The same problem
>>>> > > > still occurred: I got a mean of all the values in the graph instead of
>>>> > > > an average difference.
>>>> > > >
>>>> > > > Could someone tell me if I'm doing something wrong or what I should
>>>> > > > provide to better explain my problem?
>>>> > > >
>>>> > > > Regards,
>>>> > > > Steven van Beelen, Vrije Universiteit of Amsterdam
>>>> > >
>>>> > >
>>>> > >
>>>> > > --
>>>> > > Best Regards, Edward J. Yoon
>>>> > > @eddieyoon
>>>> > >
>>>> >
>>>>
>>>
>>>
>>

--
Best Regards, Edward J. Yoon
@eddieyoon
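
To make the difference discussed in this thread concrete, here is a minimal plain-Java sketch. It does not use Hama's aggregator classes or reproduce their code; the class name, the method names, and the sample rank values are all made up for illustration. It contrasts the mean of the current vertex values with the mean absolute difference between old and new values, for five vertices whose values sum to 1.0.

```java
// Hypothetical illustration only: not Hama's AverageAggregator or AbsDiffAggregator.
final class AggregatorSketch {

    // Mean of the current vertex values (the behavior reported in the thread).
    static double meanOfValues(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Mean of |new - old| per vertex (the behavior expected for a convergence check).
    static double meanAbsoluteDifference(double[] oldValues, double[] newValues) {
        if (oldValues.length != newValues.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < oldValues.length; i++) {
            sum += Math.abs(newValues[i] - oldValues[i]);
        }
        return sum / oldValues.length;
    }

    public static void main(String[] args) {
        // Assumed sample data: five vertices, ranks summing to 1.0 in both supersteps.
        double[] oldRanks = {0.2, 0.2, 0.2, 0.2, 0.2};
        double[] newRanks = {0.1, 0.3, 0.2, 0.25, 0.15};

        System.out.println("mean of values:           " + meanOfValues(newRanks));
        System.out.println("mean absolute difference: " + meanAbsoluteDifference(oldRanks, newRanks));
    }
}
```

Because the ranks of five vertices always sum to 1.0, the plain mean stays at about 0.2 no matter how much the individual ranks changed, which matches the symptom reported above. Only the mean absolute difference shrinks as the ranks converge, so only it is usable as a stopping criterion.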
