> Subject: Re: SitemapProcessor destroyed our CrawlDB
>
> Hi Markus,
>
> What a disaster... do/did you have any crazy rules, replacements and/or
> substitutions present in the urlnormalizer-regex configuration?
> Lewis
>
> On Wed, Jan 17, 2018 at 2:51 AM, wrote:
>
> >
>
>
> From: Markus Jelsma
> To: User
> Cc:
> Bcc:
> Date: Wed, 17 Jan 2018 10:51:49 +
> Subject: S
I'll fix NUTCH-2466 this afternoon.
-Original message-
> From:Sebastian Nagel
> Sent: Wednesday 17th January 2018 14:09
> To: user@nutch.apache.org
> Subject: Re: SitemapProcessor destroyed our CrawlDB
>
> It was finally Omkar who brought NUTCH-2442 forward.
>
My bad, thanks!
> Markus
>
> -Original message-
>> From:Sebastian Nagel
>> Sent: Wednesday 17th January 2018 13:32
>> To: user@nutch.apache.org
>> Subject: Re: SitemapProcessor destroyed our CrawlDB
>>
>> Hi Markus,
>>
>> the problem shoul
SitemapProcessor destroyed our CrawlDB
Hi Markus,
the problem should be fixed with NUTCH-2442. It wasn't the case with the
first version of the sitemap processor. It's mandatory to also check the
return value of job.waitForCompletion(true); only checking for exceptions
isn't enough!
Sebastian
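
A minimal sketch of the check described above, not the actual NUTCH-2442
patch. Hadoop's Job.waitForCompletion(boolean) returns false when the job
fails, so a caller that only catches exceptions can miss a failure. The
class JobResultCheck and the interfaces CompletableJob and OutputHandler
are hypothetical stand-ins (they are not Nutch or Hadoop types), and the
idea that a failed job's output must not replace the existing CrawlDB is
an assumption about what the fix protects against.

```java
/** Sketch only: every name in this class is hypothetical. */
class JobResultCheck {

    /** Stand-in for the single Hadoop Job method discussed in the thread. */
    interface CompletableJob {
        boolean waitForCompletion(boolean verbose) throws Exception;
    }

    /** Hypothetical hooks for handling the job's temporary output. */
    interface OutputHandler {
        void install(); // assumed: swap the temp output in as the new CrawlDB
        void discard(); // assumed: delete the temp output, keep the old CrawlDB
    }

    /**
     * Runs the job and installs its output only if the job reports success.
     * A false return value is treated exactly like a thrown exception.
     */
    static boolean runAndInstall(CompletableJob job, OutputHandler output) {
        boolean success;
        try {
            // The return value matters: false means the job failed,
            // even though no exception was thrown.
            success = job.waitForCompletion(true);
        } catch (Exception e) {
            if (e instanceof InterruptedException) {
                // Preserve the interrupt status for callers.
                Thread.currentThread().interrupt();
            }
            success = false;
        }

        if (!success) {
            output.discard();
            return false;
        }
        output.install();
        return true;
    }
}
```

With this shape, a job that returns false and a job that throws both end
in discard(), and install() is reached only on a true return value.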
On 01/17/2018 11:51 AM, Markus Jelsma w