Re: nutch slide in lucene presentation
Yonik Seeley wrote:
I'd appreciate any suggestions on what to present for Nutch... here is my current slide:

Some minor suggestions (addition of some jargon or buzzwords):

Nutch
* Highly customizable open source web search application
* Crawlers
* Link-graph database
* Document parsers (HTML, Word, PDF, etc.)
* Language + charset detection, MicroFormats Rel-Tag, OpenSearch 1.0, Creative Commons
* Utilizes Hadoop (DFS + MapReduce) for massive scalability while maintaining good performance on a single server

Also, it would be nice to mention the availability of 0.9.0 (if it is out by then ;)

--
Sami Siren
Re: Issues pending before 0.9 release
Sami Siren wrote:
Let's make it the best release ever! :) I have a good feeling about this one. There's some nice marketing material about crawling efficiency [1]. I should probably extend benching to indexing and searching too.

[1] http://blog.foofactory.fi/2007/03/twice-speed-half-size.html

Yes, I saw this - great stuff :)

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com