Re: Nutch Cosine Filter

2016-12-02 Thread Sujen Shah
Hi, Thank you for your feedback! Appreciate it. Currently, there are no tools apart from the ones you have already experimented with (topN and generate.min.score) to direct the crawl towards the top-scoring URLs. I wonder why generate.min.score did not work. I looked into the code and it

Re: Impolite crawling using NUTCH

2016-12-02 Thread Chris Mattmann
Hmm, I’m a little confused here. You were first trying to use white list robots.txt, and now you are talking about Selenium. 1. Did the white list work? 2. Are you now asking how to use Nutch and Selenium? Cheers, Chris From: jyoti aditya Date: Thursday, December 1