Well, you could use the invertlinks tool to find out which URL links to that one. Once you find it, you should be able to reproduce the link with the parsechecker tool, provided the page still links to it.
Do you have any URL normalizer plugin active? They should deal with relative paths, I think.

On Mon, 7 Oct 2024 at 11:51, Hiran Chaudhuri <[email protected]> wrote:

> While testing my protocol plugin I suddenly notice a url that just
> cannot get fetched. It looks like
>
> smb://host/../Folder1/Folder2/Folder3/Filename.extension
>
> Obviously there is a problem here as no SMB server would ever offer a
> share named '..'.
> Hence I'd like to know where this link came from.
>
> If it did not come through the protocol-plugin it must come from a
> parser or other source.
> What are the ways to discover that? Is there data in the CrawlDB that I
> am not yet aware of?
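
To illustrate how a relative link could end up as smb://host/../..., here is a minimal Java sketch using only java.net.URI. It is not Nutch code: the class name DotDotDemo, both helper methods, and the base URL smb://host/Share/ are made up for the example, and I am only assuming the link was produced by resolving a relative href with too many "../" segments.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DotDotDemo {

    // Resolve a link against the page it was found on.
    // java.net.URI keeps ".." segments it cannot cancel out,
    // so a link that climbs above the root leaves "/.." in the path.
    static String resolve(String base, String link) {
        return URI.create(base).resolve(link).toString();
    }

    // Hypothetical cleanup step (not the Nutch normalizer):
    // drop leading "/.." segments that would climb above the root.
    static String dropLeadingDotDot(String url) throws URISyntaxException {
        URI u = URI.create(url).normalize();
        String path = u.getPath();
        while (path.startsWith("/../")) {
            path = path.substring(3); // remove "/.." but keep the following "/"
        }
        if (path.equals("/..")) {
            path = "/";
        }
        return new URI(u.getScheme(), u.getAuthority(), path,
                       u.getQuery(), u.getFragment()).toString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // Assumed inputs: a page directly under a share, linking two levels up.
        String resolved = resolve("smb://host/Share/", "../../Folder1/Filename.extension");
        System.out.println(resolved);
        System.out.println(dropLeadingDotDot(resolved));
    }
}
```

If the parser resolves links roughly like this and no normalizer cleans up afterwards, a URL of the shape you are seeing would be the result. Whether that is what actually happened in your crawl is something the LinkDB lookup should confirm.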

