I mean the "join" button at http://code.google.com/p/crawler-commons/

I am familiar with Bixo and Droids. It will be hard to make minor changes in ManifoldCF, although it is possible (without the "crawler" part, only the "robots rules parser").

-Fuad

-----Original Message-----
From: Fuad Efendi [mailto:f...@efendi.ca]
Sent: June-02-11 7:05 PM
To: connectors-dev@incubator.apache.org; crawler-comm...@googlegroups.com
Subject: RE: CrawlerCommons & ManifoldCF

I'd like to join this project but can't find the "join" button :)

Thanks!

Fuad Efendi
+1 416-993-2060
http://www.linkedin.com/in/liferay

Tokenizer Inc.
http://www.tokenizer.ca/
Data Mining, Vertical Search

-----Original Message-----
From: Julien Nioche [mailto:lists.digitalpeb...@gmail.com]
Sent: June-02-11 11:11 AM
To: connectors-dev@incubator.apache.org; crawler-comm...@googlegroups.com
Subject: CrawlerCommons & ManifoldCF

Hi guys,

I'd just like to mention Crawler Commons, which is an effort between the committers of various crawl-related projects (Nutch, Bixo or Heritrix) to put some basic functionality in common. We currently have mostly a top-level domain finder and a sitemap parser, but are definitely planning to have other things there as well, e.g. a robots.txt parser, a protocol handler, etc.

Would you like to get involved? There are quite a few things that the crawler in Manifold could reuse or contribute to.

Best,

Julien

--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com