Hi Guys,

> Blocker
> ========
> * NUTCH-400 (Update & add missing license headers) - I believe this is
> fixed and should be closed

+1, thanks to Sami for closing it.

> 
> * NUTCH-353 (pages that serverside forwards will be refetched every
> time) - this was partially fixed in NUTCH-273, but a more complete
> solution would require significant changes to LinkDb. As there are no
> patches implementing this, I left it open, but it's no longer as
> critical as it was before. I propose to move it to "Major" and address
> it in the next release.

+1

> 
> * NUTCH-233 (wrong regular expression hang reduce process for ever) - I
> propose to apply the fix provided by Sean Dean and close this issue for now.

+1

> 
> Critical
> ========
> * NUTCH-436 (Incorrect handling of relative paths when the embedded URL
> path is empty). There is no patch available yet. If someone could
> contribute a patch I'd like to see this fixed before the release.

Looks like Dennis is on this one.

> 
> * NUTCH-427 (protocol-smb). This relies on a LGPL library, and it's
> certainly not critical (as this is an optional new feature). I propose
> to change it to Major, and make a decision - do we want another plugin
> like parse-mp3 or parse-rtf, or not.

Let's hold off on this: it's not necessary for 0.9, and I haven't seen
much traffic on the list identifying it as critical to get into the
sources for the release.

> 
> * NUTCH-381 (Ignore external link not work as expected) - I'll try to
> reproduce it, and if I find an easy fix I'd like to apply it before the
> release.

+1

> 
> * NUTCH-277 (Fetcher dies because of "max. redirects") - I wasn't able
> to reproduce it. If there is no updated information on this I propose to
> close it with "Can't reproduce".

+1. I had to do something similar with NUTCH-258.

> 
> * NUTCH-167 (Observation of <META NAME="ROBOTS" CONTENT="NOARCHIVE">) -
> there's a patch which I tested in a limited production env. If there are
> no objections I'd like to apply it before the release.

+1

> 
> Major
> =====
> There are 84 major issues, but some of them are either invalid, or
> should be "minor", or no longer apply and should be closed. Please
> review them if you can and provide some comments or recommendations if
> you think you have some new information.

I will spend some time going through JIRA today to see if there are any
issues that:

1. Have a patch already
2. Sound like something quick, easy, and not so far-reaching across the
entire Nutch API

> 
> 
> One decision also that we need to make is which version of Hadoop should
> be included in the release. Current trunk uses 0.10.1, I have a set of
> production-tested patches that use 0.11.2, and today the Hadoop team
> released 0.12.0 (to be followed shortly by a 0.12.1, most likely in time
> before our release). The most conservative option is to stay with
> 0.10.1, but by the time people start using Nutch this will be a fairly
> old version already. I propose to upgrade to 0.11.2. We could use 0.12.1
> - but in this case with the expectation that we release less than stable
> version of Nutch to be soon followed by a minor stable release ...

I'd agree with the upgrade to 0.11.2, +1


Cheers,
  Chris

P.S. I am going to contact Piotr and coordinate with him: I'd like to be the
release manager for this Nutch release.


