All,

I'm thinking about using some parts of Nutch in my projects.  We have been 
using Lucene for a while with great success, but I now want to change the way 
we store our content (mostly XML documents).

In the past we stored the content in MySQL; we now use simple gzipped XML 
files.  Removing the dependency on MySQL has been nice, but dealing with 
millions of small files has created obvious problems.

It looks like NDFS and MapFiles/ArrayFiles could be part of a good 
solution for us.

I'm also interested in using the MapReduce framework and possibly the Fetcher 
in our applications.

Is anyone else using just parts of Nutch?  Is the API expected to stay 
fairly stable?  Have there been thoughts or discussions about 
breaking parts of Nutch out into more general toolkits?

--Jason


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general