"It is included in this package for completeness and because there may be a 
better way to perform this function with a different algorithm."

Is anyone here able to share some techniques I don't know about yet?

> Hi,
> 
> We read that "its benefit to cost ratio is very low" [1]. In our experience
> there is very little cost, so would the benefit be even lower? Running
> countless link analysis iterations takes many hours, but running the
> Loops job with a depth of two takes much less time.
> 
> It may be computationally expensive, but the link analysis iterations
> (plus writing back the whole CrawlDB) consume a _lot_ more I/O time.
> Can anyone (Dennis?) provide some more details and explain why it's
> discouraged in production systems with billions of link nodes?
> 
> [1]: http://wiki.apache.org/nutch/NewScoring#Loops
> 
> Thanks
