I'd like to join this project but can't find a "join" button :) Thanks!

Fuad Efendi
+1 416-993-2060
http://www.linkedin.com/in/liferay
Tokenizer Inc.
http://www.tokenizer.ca/
Data Mining, Vertical Search

-----Original Message-----
From: Julien Nioche [mailto:lists.digitalpeb...@gmail.com]
Sent: June-02-11 11:11 AM
To: connectors-dev@incubator.apache.org; crawler-comm...@googlegroups.com
Subject: CrawlerCommons & ManifoldCF

Hi guys,

I'd just like to mention Crawler Commons, which is an effort between the committers of various crawl-related projects (Nutch, Bixo or Heritrix) to put some basic functionality in common. We currently have mostly a top-level domain finder and a sitemap parser, but are definitely planning to have other things there as well, e.g. a robots.txt parser, protocol handlers, etc.

Would you like to get involved? There are quite a few things that the crawler in Manifold could reuse or contribute to.

Best,

Julien

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com