Re: RE : Contribution to ManifoldCF webcrawler

2023-09-25 Thread Karl Wright
Thanks. I will have a look at first opportunity.

Karl

On Mon, Sep 25, 2023 at 7:00 AM Emeric Bernet-Rollande <emeric.ber...@francelabs.com> wrote:
> Hi,
>
> I opened a Pull Request, right here!
> https://github.com/apache/manifoldcf/pull/149
>
> Regards,
>
> Emeric Bernet-Rollande
>
> France

RE : Contribution to ManifoldCF webcrawler

2023-09-25 Thread Emeric Bernet-Rollande
Hi,

I opened a Pull Request, right here!
https://github.com/apache/manifoldcf/pull/149

Regards,

Emeric Bernet-Rollande
France Labs – Your knowledge, now
Datafari Enterprise Search – Discover our version 5
www.datafari.com

From: Furkan KAMACI
Sent: Monday 25

[GitHub] [manifoldcf] Soggard opened a new pull request, #149: Webcrawler connector - Add "Force inclusion of redirections" option

2023-09-25 Thread via GitHub
Soggard opened a new pull request, #149:
URL: https://github.com/apache/manifoldcf/pull/149

The "Force the inclusion of redirection" option allows you to include hosts redirected from original seeds. You might want to use this option if the site you are crawling is subject to

Re: Contribution to ManifoldCF webcrawler

2023-09-25 Thread Furkan KAMACI
Hi Emeric,

First of all, thank you for your effort and suggestion. Do you have a Pull Request for that improvement?

Kind regards,
Furkan Kamaci

On Mon, Sep 25, 2023 at 10:23 AM Emeric Bernet-Rollande <emeric.ber...@francelabs.com> wrote:
> Hi Karl and all!
>
> I’ve been working on the

Contribution to ManifoldCF webcrawler

2023-09-25 Thread Emeric Bernet-Rollande
Hi Karl and all!

I’ve been working on the MCF webcrawler component for our Datafari project, and I made some developments that might interest the MCF community.

Currently, if a website redirects the user with a 301 or 302 status code and "limit to seed" is checked, the website (the one