Thanks Lewis.
Hi Radim,
If there is anything I can help with, please let me know.
Thanks
Rajesh Ramana
-----Original Message-----
From: lewis john mcgibbney [mailto:[email protected]]
Sent: Saturday, October 08, 2011 2:22 PM
To: [email protected]
Subject: Re: Nutch not crawling URLs with spanish accented characters ( ñ)
Hi guys,
I have been watching this thread intently and I am very happy to see that there
is some progress :0)
Radim,
Could you open a JIRA issue and submit a patch? That way we can not only
track it, but the community will also get a chance to test and validate the
patch prior to integration into the source.
Thanks
Lewis
On Fri, Oct 7, 2011 at 5:49 PM, Ramanathapuram, Rajesh <
[email protected]> wrote:
> Hi Radim,
>
> Thank you so much for this. I am not familiar with the commit process
> for the core.
> Is there someone who can help us get this committed and help resolve
> this issue?
>
> Thanks for all your help.
>
> Rajesh Ramana
>
> -----Original Message-----
> From: Radim Kolar [mailto:[email protected]]
> Sent: Thursday, October 06, 2011 2:18 PM
> To: [email protected]
> Subject: Re: Nutch not crawling URLs with spanish accented characters
> ( ñ)
>
> - The REGEX normalizer transforms the special characters, but fails to
> substitute '%F1' or '%C3%B1' for 'ñ'
> - The fetcher is having trouble interpreting the links with special
> character 'ñ'.
>
> I can add this transformation to the basic-url normalizer if somebody is
> willing to commit it.
>
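For anyone following the thread, here is a rough sketch of the transformation
discussed in the quoted message: replacing 'ñ' with '%C3%B1' (its UTF-8 bytes)
or '%F1' (its ISO-8859-1 byte). This is not Nutch code and not the proposed
patch. The class name NonAsciiUrlEscaper, the method escapeNonAscii and the
example.com URL are made up for illustration, and the sketch assumes that only
non-ASCII characters need escaping.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Nutch: percent-encodes non-ASCII characters.
public class NonAsciiUrlEscaper {

    // Returns url with every non-ASCII character replaced by the %XX escapes
    // of its bytes in the given charset. ASCII characters are copied as-is.
    public static String escapeNonAscii(String url, Charset charset) {
        StringBuilder out = new StringBuilder(url.length());
        for (int i = 0; i < url.length(); ) {
            int cp = url.codePointAt(i);
            int len = Character.charCount(cp);
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                // Encode this one character and escape each resulting byte.
                byte[] bytes = url.substring(i, i + len).getBytes(charset);
                for (byte b : bytes) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String url = "http://example.com/espa\u00f1ol"; // illustrative URL only
        // prints http://example.com/espa%C3%B1ol
        System.out.println(escapeNonAscii(url, StandardCharsets.UTF_8));
        // prints http://example.com/espa%F1ol
        System.out.println(escapeNonAscii(url, StandardCharsets.ISO_8859_1));
    }
}
```

Which of the two forms a given server accepts depends on the encoding it
expects, so a real normalizer would probably need to pick one deliberately.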
--
*Lewis*