Hi,

I opened a pull request here:
https://github.com/apache/manifoldcf/pull/149

Regards,

Emeric Bernet-Rollande

France Labs – Your knowledge, now
Datafari Enterprise Search – Discover our version 5
www.datafari.com

From: Furkan KAMACI
Sent: Monday, September 25, 2023 09:28
To: dev@manifoldcf.apache.org
Cc: olivier.tav...@francelabs.com; France Labs
Subject: Re: Contribution to ManifoldCF webcrawler

Hi Emeric,

First of all, thank you for your effort and suggestion. Do you have a Pull
Request for that improvement?

Kind regards,
Furkan Kamaci

On Mon, Sep 25, 2023 at 10:23 AM Emeric Bernet-Rollande <
emeric.ber...@francelabs.com> wrote:

> Hi Karl and all!
>
>
>
> I’ve been working on the MCF webcrawler component for our Datafari
> project, and I made some developments that might interest the MCF community.
>
>
>
> Currently, if a website redirects the user with a 301 or 302 status code
> and "limit to seed" is checked, the website that the redirection points to
> won't be indexed. We added an option, "Force the inclusion of
> redirections", which overrides that checkbox when the crawl encounters a
> redirection.
>
>
>
>
>
> Would you be interested in getting the patch to integrate it into
> ManifoldCF? The corresponding documentation can be found here:
> https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/1886879745/Web+Connectors
>
>
>
> Regards,
>
>
>
> Emeric Bernet-Rollande
>
>
>
> *France Labs – Your knowledge, now*
>
> Datafari Enterprise Search – Discover our version 5
> www.datafari.com
>
>
>
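
For readers following the thread, the rule described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the pull request: the class name RedirectPolicy, the method shouldInclude, and the idea of comparing host names against a set of seed hosts are all hypothetical and do not reflect ManifoldCF's actual connector API.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

/**
 * Hypothetical sketch of the decision described in the thread.
 * None of these names come from ManifoldCF or from the pull request.
 */
public class RedirectPolicy {
    private final Set<String> seedHosts;          // assumed: lower-case host names of the seed URLs
    private final boolean limitToSeed;            // the "limit to seed" checkbox
    private final boolean forceRedirectInclusion; // the "Force the inclusion of redirections" option

    public RedirectPolicy(Set<String> seedHosts, boolean limitToSeed,
                          boolean forceRedirectInclusion) {
        this.seedHosts = seedHosts;
        this.limitToSeed = limitToSeed;
        this.forceRedirectInclusion = forceRedirectInclusion;
    }

    /** True for the two redirect status codes mentioned in the message. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302;
    }

    /**
     * Decides whether a target URL should be crawled.
     *
     * @param target       the URL about to be fetched
     * @param sourceStatus HTTP status of the response that led to this URL
     */
    public boolean shouldInclude(URI target, int sourceStatus) {
        if (!limitToSeed) {
            return true; // no host restriction at all
        }
        String host = target.getHost();
        if (host != null && seedHosts.contains(host.toLowerCase(Locale.ROOT))) {
            return true; // target stays on a seed host
        }
        // Off-seed target: allowed only when it was reached through a
        // 301/302 redirect and the override option is enabled.
        return forceRedirectInclusion && isRedirect(sourceStatus);
    }
}
```

With "limit to seed" checked and the override off, an off-seed redirect target is rejected. With the override on, the same target is accepted, while an ordinary off-seed link (for example one found on a page served with status 200) is still rejected.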
