I have a website, e.g. www.example.com. When I crawl it with Nutch 1.4, I run into duplicate crawling. There are a number of pages like www.example.com/s38r84rejkfndn/xyz.aspx. The segment s38r84rejkfndn changes every time the page is visited, so the crawler fetches the same page again and again; as far as Nutch is concerned, I think each visit produces a new URL. Please suggest how I can overcome this issue.
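To make the problem concrete, here is a minimal sketch of the kind of URL normalization that would make these addresses compare equal. It is only an illustration in Python, not Nutch code: the function name `strip_session_segment` is hypothetical, and the pattern rests on the assumption that the changing token is always the first path segment and consists of ten or more lowercase letters and digits.

```python
import re

# Assumption: the volatile token is the first path segment and is a run of
# 10 or more lowercase letters/digits (as in "s38r84rejkfndn").
SESSION_SEGMENT = re.compile(r"^((?:https?://)?[^/]+)/[a-z0-9]{10,}/")


def strip_session_segment(url: str) -> str:
    """Drop the changing first path segment so repeat visits map to one URL."""
    # Keep the optional scheme and the host (group 1), discard the token.
    return SESSION_SEGMENT.sub(r"\1/", url, count=1)


first_visit = strip_session_segment("www.example.com/s38r84rejkfndn/xyz.aspx")
second_visit = strip_session_segment("www.example.com/k92x01mzpqwert/xyz.aspx")
# Both visits now normalize to the same address.
print(first_visit == second_visit)
```

If the token has a different shape on the real site, the pattern would need adjusting. In Nutch itself, a rule along these lines would presumably belong in the regex URL normalizer configuration (`conf/regex-normalize.xml`, used by the `urlnormalizer-regex` plugin) rather than in separate code, and the substitution would use Java regex syntax.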
- nutch crawling issues devang pandey
- RE: nutch crawling issues Markus Jelsma
- Re: nutch crawling issues devang pandey
- RE: nutch crawling issues Markus Jelsma
- Re: nutch crawling issues devang pandey