Unfortunately I am not a developer. But as a user of Nutch on a single machine, and very happy with 0.7.2, I think this is good news. There is a feature I would like to see in nutch-default.xml: "db.ignore.external.links". I just don't know how to add it, because the current "db.max.outlinks.per.page", in my experience, doesn't give results as good as the former, which I used in 0.8.1.

Thanks,
Carmmello
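
For readers wondering what such a setting looks like: in the 0.8.x line, properties like db.ignore.external.links are defined in nutch-default.xml and are normally overridden in conf/nutch-site.xml. The fragment below is only a sketch of such an override, assuming the 0.8.x property format. The comment wording is illustrative and not copied from the distribution. As far as this message indicates, 0.7.2 does not read this property, so adding it there would have no effect without a code change.

```xml
<!-- Sketch only: goes inside the existing root element of conf/nutch-site.xml (0.8.x). -->
<property>
  <name>db.ignore.external.links</name>
  <!-- true: outlinks pointing to a different host are dropped,
       so the crawl stays on the hosts it started from. -->
  <value>true</value>
</property>
```

By contrast, db.max.outlinks.per.page only caps how many outlinks are processed per page. It does not restrict which hosts they point to, which may be why the two settings give different results.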
----- Original Message -----
From: "Piotr Kosiorowski" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, November 15, 2006 11:42 AM
Subject: Re: Strategic Direction of Nutch

>I agree with Andrzej. On my part if some takes the effort of
> preparing patches and testing I as a committer (not very active one
> recently) may focus on 7.2 issues and commit the patches. And in
> future prepare 7.3 release.
> Regards,
> Piotr
>
> On 11/15/06, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
>> Nitin Borwankar wrote:
>> > Hi all,
>> >
>> > First an intro. I am another Nutch newbie and am finding 0.7.2 to be
>> > quite an effective single machine crawler.
>> >
>> [..]
>> > The ability to keep db formats compatible would be nice to allow reuse
>> > of existing results but is not necessary.
>> >
>>
>> That's probably not going to happen - each branch has specific
>> requirements from the db and segment formats, which are incompatible.
>> However, given enough interest we could implement converters, even
>> bi-directional.
>>
>> > As a potential developer I would like to volunteer for the ongoing
>> > maintenance and evolution of 0.7.2 as an effective single machine
>> > crawler.
>> >
>>
>> That's excellent! I imagine the procedure to get you involved would be
>> something like this:
>>
>> * start collecting issues related to maintenance, bugfixes or
>> improvements of that branch,
>>
>> * create JIRA issues, plus start collecting patches, tested and ready
>> for committing. One of the existing developers will commit them on your
>> behalf.
>>
>> * after a while we would consider giving you committer rights so that
>> you could work directly with the code.
>>
>> > Consider this a proposal to maintain two separate versions by
>> > continuing
>> > bug fix versions of 0.7 until one of two things happen
>> >
>> > a) 0.8 evolves to something satisfactory for use as also as a single
>> > machine search engine and everyone is happy moving to it
>> > b) a critical mass of developers steps forward to support the ongoing
>> > development of 0.7.2 into say Nutch-lite always and only meant for
>> > single machine use.
>> >
>> I do hope that option a) becomes a reality sooner rather than later. But
>> if there is sufficient interest (and enough developers) in developing 0.7
>> branch, then go for it - keeping in mind, though, that eventually these
>> code bases will diverge so much that maintaining them will require two
>> mostly separate teams ...
>>
>> --
>> Best regards,
>> Andrzej Bialecki <><
>> ___. ___ ___ ___ _ _ __________________________________
>> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
>> ___|||__|| \| || | Embedded Unix, System Integration
>> http://www.sigram.com Contact: info at sigram dot com
>>

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
