Sami Siren wrote:
> Hello,
>
> It has been a while from a previous release (0.8.1) and looking at the
> great fixes done in trunk I'd start thinking about baking a new release
> soon.
>
> Looking at the jira roadmaps there are 1 blocking issues (fixing the
> license headers) for 0.8.2 and two other blocking issues for 0.9.0 of
> which I think NUTCH-233 is safe to put in.
>

Agreed. The replacement regex mentioned in the original comment seems
safe enough, and simpler.

> The top 10 voted issues are currently:
>
> NUTCH-61 Adaptive re-fetch interval. Detecting umodified content
>

Well ... I'm of a split mind on this. I can bring this patch up to date
and apply it before 0.9.0, if we understand that this is a "0" release
... ;) Otherwise I'd prefer to wait with it until right after the
release.

I would also like to proceed with NUTCH-339 (Fetcher2 patches plus some
changes I made in the meantime), since I'd like to expose the new
fetcher to a broader audience, and it doesn't affect the existing
implementation.

> NUTCH-48 "Did you mean" query enhancement/refignment feature
> NUTCH-251 Administration GUI
> NUTCH-289 CrawlDatum should store IP address
>

I'm still not entirely convinced about this - and there is already a
mechanism in place to support it if someone really wishes to keep this
particular info (CrawlDatum.metaData).

> NUTCH-36 Chinese in Nutch
> NUTCH-185 XMLParser is configurable xml parser plugin.
> NUTCH-59 meta data support in webdb
> NUTCH-92 DistributedSearch incorrectly scores results
>

This is too intrusive to fix just before the release - and needs
additional discussion.

> NUTCH-68 A tool to generate arbitrary fetchlists

Easy to port this to 0.9.0 - I can do this.

> NUTCH-87 Efficient site-specific crawling for a large number of sites
>

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
