Hi.

I'm indexing an intranet and I see that some pages are fetched as many as twenty times. The site uses a lot of anchors, so there are many links like the ones in the subject, which differ only in the part after the hash sign.

Is there some way I can instruct the crawler to discard the part of the URL that comes after the hash sign? I'm using Nutch built from trunk a few months ago.
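
To make clear what I mean, here is a minimal sketch in plain Java of the behaviour I'm after. It is not Nutch code and does not use any Nutch API; the class name, method name and example URL are made up for illustration only.

```java
public final class FragmentStripper {
    private FragmentStripper() {}

    /**
     * Returns the URL with its fragment removed, i.e. everything from
     * the first '#' onwards is dropped. URLs without '#' are returned as-is.
     * (Hypothetical helper, not part of Nutch.)
     */
    public static String stripFragment(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    public static void main(String[] args) {
        // Hypothetical intranet URL; both forms should map to the same page.
        System.out.println(stripFragment("http://intranet.example/page.html#section2"));
        System.out.println(stripFragment("http://intranet.example/page.html"));
    }
}
```

With something like this applied before URLs are added to the fetch list, all the anchor variants of a page would collapse into a single entry.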

TIA,

Per.
