Hi Karl and all,

I’ve been working on the MCF web crawler component for our Datafari project, and I’ve made some changes that might interest the MCF community. Currently, if a website answers with a 301 or 302 redirect and the "limit to seed" option is checked, the website that the redirection points to won’t be indexed. We added an option, "Force the inclusion of redirections", which overrides the "limit to seed" checkbox whenever the crawl encounters a redirection.
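To make the intended behaviour concrete, here is a minimal sketch of the decision in plain Java. It is not the actual patch and does not use any ManifoldCF API: the class name, the method `shouldFetch`, and all parameter names are hypothetical, and it assumes that "limit to seed" is a simple host comparison against the seed URLs.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

public class RedirectInclusionSketch {

    /**
     * Hypothetical helper (not ManifoldCF code): decides whether a URL may be
     * fetched, given the "limit to seed" and "force redirections" options.
     */
    static boolean shouldFetch(URI target,
                               Set<String> seedHosts,
                               boolean limitToSeeds,
                               boolean forceRedirectInclusion,
                               boolean reachedViaRedirect) {
        // Without "limit to seed", every URL is eligible.
        if (!limitToSeeds) {
            return true;
        }
        // URLs on a seed host are always eligible.
        String host = target.getHost();
        if (host != null && seedHosts.contains(host.toLowerCase(Locale.ROOT))) {
            return true;
        }
        // Off-seed URL: eligible only if it is the target of a 301/302
        // redirection and the new option is enabled.
        return forceRedirectInclusion && reachedViaRedirect;
    }

    public static void main(String[] args) {
        Set<String> seeds = Set.of("example.com");
        URI redirected = URI.create("https://www.example.org/home");

        // Current behaviour: the redirection target is skipped.
        System.out.println(shouldFetch(redirected, seeds, true, false, true)); // false
        // With the new option: the redirection target is included.
        System.out.println(shouldFetch(redirected, seeds, true, true, true));  // true
    }
}
```

Note that in this sketch the override only applies to URLs reached through a redirection; ordinary off-seed links are still excluded when "limit to seed" is checked.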
Would you be interested in getting the patch to integrate it into ManifoldCF? The corresponding documentation can be found here:
https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/1886879745/Web+Connectors

Regards,
Emeric Bernet-Rollande
France Labs – Your knowledge, now
Datafari Enterprise Search – Discover our version 5
www.datafari.com