Sounds like you're looking for a full-text inverted index.  Lucene is a good
open-source implementation of that.  I believe it has an option for storing the
original full text alongside the index.
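
For what it's worth, here is a minimal sketch of the idea in plain Python. It
is not Lucene's API: the class name TinyInvertedIndex, its methods, and the
sample file names are made up for illustration. It maps each term to the set of
documents containing it and keeps the original text next to the index.

```python
import re
from collections import defaultdict


class TinyInvertedIndex:
    """Toy in-memory inverted index; illustrative only, not Lucene's API."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.stored = {}                  # document id -> original text

    @staticmethod
    def tokenize(text):
        # Lowercase, then split on runs of non-alphanumeric characters.
        return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

    def add(self, doc_id, text):
        # Keep the original text and record the document under each term.
        self.stored[doc_id] = text
        for term in self.tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        # AND query: ids of documents that contain every query term.
        terms = self.tokenize(query)
        if not terms:
            return []
        result = None
        for term in terms:
            ids = self.postings.get(term, set())
            result = ids if result is None else result & ids
        return sorted(result)


# Hypothetical sample documents.
index = TinyInvertedIndex()
index.add("a.xml", "<note>Hadoop stores large files</note>")
index.add("b.txt", "Lucene indexes text files")
print(index.search("files"))
print(index.search("lucene files"))
```

A lookup only touches the posting sets for the query terms, so the files
themselves are never rescanned. A real engine adds ranking, on-disk storage,
and proper XML handling on top of this.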
--Matt

On May 31, 2011, at 10:50 AM, cs230 wrote:


Hello All,

I am planning to start a project where I have to store a large volume of XML
and text files. On top of that, I have to implement an efficient algorithm for
searching over thousands or millions of files, and also build indexes to
make subsequent searches faster.

I looked into Oracle Database, but it delivered very poor results. Can I use
Hadoop for this? Which Hadoop project would be the best fit?

Is there anything from Google I can use? 

Thanks a lot in advance.
-- 
View this message in context: 
http://old.nabble.com/trying-to-select-technology-tp31743063p31743063.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.

