Won't db.max.outlinks.per.page result in missing links? I don't want that. I would just like to push the extra URLs from the same site to the next fetch cycle, so that each fetch list stays balanced.
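
For illustration only, here is a minimal sketch of the behaviour being asked for: cap the number of URLs per host in one fetch list and carry the overflow over to the next cycle, so that no link is lost. This is plain Python, not Nutch code; the function name `split_fetch_list` and the `max_per_host` parameter are hypothetical and do not correspond to any Nutch setting.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def split_fetch_list(urls, max_per_host):
    """Split candidate URLs into (fetch_now, deferred).

    At most `max_per_host` URLs per host go into `fetch_now`; the rest are
    returned in `deferred` so they can be scheduled in the next cycle
    instead of being dropped.
    """
    taken = defaultdict(int)   # number of URLs already accepted per host
    fetch_now, deferred = [], []
    for url in urls:
        host = urlsplit(url).hostname or ""
        if taken[host] < max_per_host:
            taken[host] += 1
            fetch_now.append(url)
        else:
            deferred.append(url)
    return fetch_now, deferred


if __name__ == "__main__":
    candidates = [
        "http://a.example/1",
        "http://a.example/2",
        "http://a.example/3",
        "http://b.example/1",
    ]
    now, later = split_fetch_list(candidates, max_per_host=2)
    print("fetch now:", now)
    print("deferred: ", later)
```

Unlike a cap on outlinks per page, nothing is discarded here: the deferred URLs stay candidates for the next cycle.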

2009/8/26 Fuad Efendi <[email protected]>

> You can filter some unnecessary "tail" using UrlFilter; for instance, some
> sites may have long forums which you don't need, or shopping cart / process
> to checkout pages which they forgot to restrict via robots.txt...
>
> Check regex-urlfilter.txt.template in /conf
>
> Another parameter which equalizes 'per-site' URLs is
> db.max.outlinks.per.page=100 (some sites may have 10 links per page, others
> - 1000...)
>
> -Fuad
> http://www.linkedin.com/in/liferay
> http://www.tokenizer.org
>
> -----Original Message-----
> From: MilleBii [mailto:[email protected]]
> Sent: August-25-09 5:48 PM
> To: [email protected]
> Subject: Limiting number of URL from the same site in a fetch cycle
>
> I'm wondering if there is a setting by which you can limit the number of
> urls per site on a fetch list, not a on a total site.
> In this way I could avoid long tails in a fetch list all from the same site
> so it takes damn long (5s per URL), I'd like to fetch them on the next
> cycle.
>
> --
> -MilleBii-

--
-MilleBii-
