Juha Nurmi <juha.nu...@ahmia.fi> writes:

> On 22.04.2014 17:35, George Kadianakis wrote:
>> Enjoy GSoC :)
>
> I will :)
>
>> BTW, looking again at your proposal, I see that you are going to
>> do both popularity tracking and backlinks.
>
> Yes, another crawler gathers backlinks from the public WWW and I will
> start gathering the URL clicks from the users.
>
>> How are these two technologies going to interact with each other?
>> That is, how will the indexer consider the output of those two
>> features?
>
> Django front-end re-sorts the answers from YaCy back-end.
>
> See https://ahmia.fi/static/gsoc/re_sort.jpg
>
> I have this idea in mind: https://ahmia.fi/static/gsoc/sorter.py
>
> The result is sorted according to YaCy result index, number of
> backlinks and clicks which are scaled.
>
> Note the scaling: p_info.backlinks = 1 / (float(index) + 1) etc.
>
> sum_function = 3.0*self.yacy + 2.0*self.backlinks + 1.0*self.clicks
>
> where 3, 2 and 1 are test coefficients. I will optimize these and made
> a better model if necessary. However, clicks are easily spoofed and
> there have to be small coefficient for them.
>
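For concreteness, here is one way to read the re-sorting described above. This is only a minimal sketch, not the actual sorter.py. It assumes that each signal (YaCy order, backlink count, click count) is first turned into a rank, that the rank index is scaled as 1 / (index + 1), and that URLs are unique within a result list. The Result class, the function names, and the example URLs are hypothetical; only the scaling and the 3/2/1 test coefficients come from the message.

```python
from dataclasses import dataclass

# Test coefficients quoted in the thread; placeholders, not tuned values.
W_YACY, W_BACKLINKS, W_CLICKS = 3.0, 2.0, 1.0


@dataclass
class Result:
    # Hypothetical container; the field names are assumptions.
    url: str
    backlinks: int  # raw backlink count from the backlink crawler
    clicks: int     # raw click count gathered from users


def rank_scores(urls_in_order):
    """Map each URL to 1 / (index + 1), where index is its 0-based rank."""
    return {url: 1.0 / (float(index) + 1)
            for index, url in enumerate(urls_in_order)}


def re_sort(results):
    """Re-sort results (given in YaCy order) by the weighted sum of scaled ranks."""
    yacy = rank_scores([r.url for r in results])
    by_backlinks = sorted(results, key=lambda r: r.backlinks, reverse=True)
    by_clicks = sorted(results, key=lambda r: r.clicks, reverse=True)
    backlinks = rank_scores([r.url for r in by_backlinks])
    clicks = rank_scores([r.url for r in by_clicks])

    def total(r):
        return (W_YACY * yacy[r.url]
                + W_BACKLINKS * backlinks[r.url]
                + W_CLICKS * clicks[r.url])

    # sorted() is stable, so ties keep their original YaCy order.
    return sorted(results, key=total, reverse=True)


if __name__ == "__main__":
    # Made-up example data, listed in the order YaCy returned it.
    yacy_order = [
        Result("a.onion", backlinks=0, clicks=0),
        Result("b.onion", backlinks=10, clicks=50),
        Result("c.onion", backlinks=3, clicks=5),
    ]
    print([r.url for r in re_sort(yacy_order)])
    # prints ['b.onion', 'a.onion', 'c.onion']
```

In this example the second YaCy hit overtakes the first because it leads on both backlinks and clicks (4.5 against 4.0), which shows how much the outcome depends on the coefficients.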
That makes sense.

BTW, what is the 'yacy' score? Is it just the order that YaCy's indexer
chose for each result? Or does YaCy actually expose a score for each
result? How is the score derived? Or do you treat it as a black box and
assume it is more accurate than backlinks and popularity?

Thanks!